TRANSGENIC PLANTS EXPRESSING PHOTORHABDUS TOXIN

BACKGROUND OF THE INVENTION 
As reported in WO98/08932, protein toxins from the
genus Photorhabdus have been shown to have oral toxicity
against insects. The toxin complex produced by
Photorhabdus luminescens (W-14), for example, has been
shown to contain ten to fourteen proteins, and it is
known that these are produced by expression of genes from
four distinct genomic regions: tca, tcb, tcc, and tcd.
WO98/08932 discloses nucleotide sequences for the native
toxin genes.

Of the separate toxins isolated from Photorhabdus
luminescens (W-14), those designated Toxin A and Toxin B
are especially potent against target insect species of
interest, for example corn rootworm. Toxin A is
comprised of two different subunits. The native gene
tcdA (SEQ ID NO:1) encodes protoxin TcdA (see SEQ ID
NO:1). As determined by mass spectrometry, TcdA is
processed by one or more proteases to provide Toxin A.
More specifically, TcdA is an approximately 282.9 kDa
protein (2516 aa) that is processed to provide TcdAii, an
approximately 208.2 kDa (1849 aa) protein encoded by
nucleotides 265-5811 of SEQ ID NO:1, and TcdAiii, an
approximately 63.5 kDa (579 aa) protein encoded by
nucleotides 5812-7551 of SEQ ID NO:1.

Toxin B is similarly comprised of two different
subunits. The native gene tcbA (SEQ ID NO:2) encodes
protoxin TcbA (see SEQ ID NO:2). As determined by mass
spectrometry, TcbA is processed by one or more proteases
to provide Toxin B. More specifically, TcbA is an
approximately 280.6 kDa (2504 aa) protein that is
processed to provide TcbAii, an approximately 207.7 kDa
(1844 aa) protein encoded by nucleotides 262-5793 of SEQ
ID NO:2, and TcbAiii, an approximately 62.9 kDa (573 aa)
protein encoded by nucleotides 5794-7512 of SEQ ID NO:2.
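
The subunit lengths quoted above can be checked against the nucleotide coordinates. The following is a minimal sketch only; the helper name codons_in_range is an illustrative assumption and forms no part of the disclosed sequences. It counts the complete codons in an inclusive, 1-based nucleotide range:

```python
def codons_in_range(start: int, end: int) -> int:
    """Number of complete codons in an inclusive, 1-based nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


# Coordinates as quoted in the text for SEQ ID NO:1 (tcdA) and SEQ ID NO:2 (tcbA).
tcdAii = codons_in_range(265, 5811)    # 1849 codons, matching 1849 aa
tcdAiii = codons_in_range(5812, 7551)  # 580 codons; 579 aa if one codon is the stop (assumption)
tcbAii = codons_in_range(262, 5793)    # 1844 codons, matching 1844 aa
tcbAiii = codons_in_range(5794, 7512)  # 573 codons, matching 573 aa
print(tcdAii, tcdAiii, tcbAii, tcbAiii)
```

The one-codon difference for TcdAiii is consistent with the quoted range including the stop codon, although the text does not state this explicitly.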
The native tcdA and tcbA genes are not well suited
for high level expression in plants. They encode
multiple destabilization sequences, mRNA splice sites,
polyA addition sites and other possibly detrimental
sequence motifs. In addition, the codon compositions are
not like those of plant genes. WO98/08932 gives general
guidance on how the toxin genes could be reengineered to
be more efficiently expressed in the cytoplasm of plants,
and describes how plants can be transformed to
incorporate the Photorhabdus toxin genes into their
genomes.

SUMMARY OF THE INVENTION 
In a preferred embodiment, the invention provides
novel polynucleotide sequences that encode TcdA and TcbA.
The novel sequences have base compositions that differ
substantially from the native genes, making them more
similar to plant genes. The new sequences are suitable
for high expression in both monocots and dicots,
and this feature is designated by referring to the
sequences as satisfying "hemicot" criteria, which are set
forth in detail hereinafter. Other important features of the
sequences are that potentially deleterious sequences have
been eliminated, and unique restriction sites have been
built in to enable adding or changing expression
elements, organellar targeting signals, engineered
protease sites and the like, if desired.

In a particularly preferred embodiment, the
invention provides polynucleotide sequences that satisfy
hemicot criteria and that comprise a sequence encoding an
endoplasmic reticulum signal or similar targeting
sequence for a cellular organelle in combination with a
sequence encoding TcdA or TcbA.

More broadly, the invention provides engineered
nucleic acids encoding functional Photorhabdus toxins
wherein the sequences satisfy hemicot criteria.

The invention also provides transgenic plants with
genomes comprising a novel sequence of the invention that
imparts functional activity against insects.

BRIEF DESCRIPTION OF SEQUENCES 
SEQ ID NO:1 is the native tcdA DNA sequence together
with the corresponding encoded amino acid sequence for
TcdA.

SEQ ID NO:2 is the native tcbA DNA sequence together
with the corresponding encoded amino acid sequence for
TcbA.

SEQ ID NO:3 is an artificial sequence encoding TcdA
that is suitable for expression in monocot and dicot
plants.

SEQ ID NO:4 is an artificial sequence encoding TcbA
that is suitable for expression in monocot and dicot
plants.

SEQ ID NO:5 is an artificial hemicot sequence that
encodes the 21 amino acid ER signal peptide of 15 kDa
zein from Black Mexican Sweet maize.

SEQ ID NO:6 is an artificial hemicot sequence that
encodes the full-length native TcdA protein (amino
acids 22-2537) fused to the modified 15 kDa zein
endoplasmic reticulum signal peptide (amino acids 1-21).

DETAILED DESCRIPTION

The native Photorhabdus toxins are protein complexes
that are produced and secreted by growing bacterial cells
of the genus Photorhabdus. Of particular interest are
the proteins produced by the species Photorhabdus
luminescens. The protein complexes have a molecular size
of approximately 1,000 kDa and can be separated by
SDS-PAGE gel analysis into numerous component proteins. The
toxins contain no hemolysin, lipase, type C
phospholipase, or nuclease activities. The toxins
exhibit significant toxicity upon ingestion by a number
of insects.

A unique feature of Photorhabdus is its
bioluminescence. Photorhabdus may be isolated from a
variety of sources. One such source is nematodes, more
particularly nematodes of the genus Heterorhabditis.
Another such source is human clinical samples from
wounds, see Farmer et al. 1989 J. Clin. Microbiol. 27 pp.
1594-1600. These saprophytic strains are deposited in the
American Type Culture Collection (Rockville, MD) ATCC #s
43948, 43949, 43950, 43951, and 43952, and are
incorporated herein by reference. It is possible that
other sources could harbor Photorhabdus bacteria that
produce insecticidal toxins. Such sources in the
environment could be either terrestrial or aquatic based.
The genus Photorhabdus is taxonomically defined as a
member of the Family Enterobacteriaceae, although it has
certain traits atypical of this family. For example,
strains of this genus are nitrate reduction negative,
yellow and red pigment producing and bioluminescent.
This latter trait is otherwise unknown within the
Enterobacteriaceae. Photorhabdus has only recently been
described as a genus separate from Xenorhabdus
(Boemare et al., 1993 Int. J. Syst. Bacteriol. 43,
249-255). This differentiation is based on DNA-DNA
hybridization studies, phenotypic differences (e.g.,
presence (Photorhabdus) or absence (Xenorhabdus) of
catalase and bioluminescence) and the Family of the
nematode host (Xenorhabdus; Steinernematidae,
Photorhabdus; Heterorhabditidae). Comparative cellular
fatty-acid analyses (Janse et al. 1990, Lett. Appl.
Microbiol. 10, 131-135; Suzuki et al. 1990, J. Gen. Appl.
Microbiol., 36, 393-401) support the separation of
Photorhabdus from Xenorhabdus.

Currently, the bacterial genus Photorhabdus is
comprised of a single defined species, Photorhabdus
luminescens (ATCC Type strain #29999, Poinar et al.,
1977, Nematologica 23, 97-102). A variety of related
strains have been described in the literature (e.g.,
Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845;
Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp.
249-255; Putz et al. 1990, Appl. Environ. Microbiol., 56,
181-186).

The following toxin producing Photorhabdus strains 
have been deposited: 

strain          accession number    date of deposit

W-14            ATCC 55397          March 5, 1993
WX1             NRRL B-21710        April 29, 1997
WX2             NRRL B-21711        April 29, 1997
WX3             NRRL B-21712        April 29, 1997
WX4             NRRL B-21713        April 29, 1997
WX5             NRRL B-21714        April 29, 1997
WX6             NRRL B-21715        April 29, 1997
WX7             NRRL B-21716        April 29, 1997
WX8             NRRL B-21717        April 29, 1997
WX9             NRRL B-21718        April 29, 1997
WX10            NRRL B-21719        April 29, 1997
WX11            NRRL B-21720        April 29, 1997
WX12            NRRL B-21721        April 29, 1997
WX14            NRRL B-21722        April 29, 1997
WX15            NRRL B-21723        April 29, 1997
[illegible]     NRRL B-21727        April 29, 1997
Hb              NRRL B-21726        April 29, 1997
Hm              NRRL B-21725        April 29, 1997
HP88            NRRL B-21724        April 29, 1997
[illegible]     NRRL B-21728        April 29, 1997
[illegible]     NRRL B-21729        April 29, 1997
[illegible]     NRRL B-21730        April 29, 1997
[illegible]     NRRL B-21731        April 29, 1997
ATCC 43948      ATCC 55878          November 5, 1996
ATCC 43949      ATCC 55879          November 5, 1996
ATCC 43950      ATCC 55880          November 5, 1996
ATCC 43951      ATCC 55881          November 5, 1996
ATCC 43952      ATCC 55882          November 5, 1996
[illegible]     NRRL B-21707        April 29, 1997
[illegible]     NRRL B-21708        April 29, 1997
[illegible]     NRRL B-21709        April 29, 1997
[illegible]     NRRL B-21683        April 29, 1997
[illegible]     NRRL B-21684        April 29, 1997
[illegible]     NRRL B-21685        April 29, 1997
[illegible]     NRRL B-21686        April 29, 1997
[illegible]     NRRL B-21687        April 29, 1997
K-122           NRRL B-21688        April 29, 1997
HMGD            NRRL B-21689        April 29, 1997
[illegible]     NRRL B-21690        April 29, 1997
[illegible]     NRRL B-21691        April 29, 1997
PWH-5           NRRL B-21692        April 29, 1997
[illegible]     NRRL B-21693        April 29, 1997
HF-85           NRRL B-21694        April 29, 1997
A Cows          NRRL B-21695        April 29, 1997
MP1             NRRL B-21696        April 29, 1997
MP2             NRRL B-21697        April 29, 1997
MP3             NRRL B-21698        April 29, 1997
MP4             NRRL B-21699        April 29, 1997
MP5             NRRL B-21700        April 29, 1997
GL98            NRRL B-21701        April 29, 1997
GL101           NRRL B-21702        April 29, 1997
GL138           NRRL B-21703        April 29, 1997
GL155           NRRL B-21704        April 29, 1997
GL217           NRRL B-21705        April 29, 1997
GL257           NRRL B-21706        April 29, 1997


All strains were deposited in accordance with the
terms of the Budapest Treaty. Strains having
accession numbers prefaced by "ATCC" were deposited
on the indicated date in the American Type Culture
Collection, 12301 Parklawn Drive, Rockville, MD
20852 USA. Strains prefaced by "NRRL" were
deposited on the indicated date in the Agricultural
Research Service Patent Culture Collection (NRRL),
National Center for Agricultural Utilization
Research, ARS-USDA, 1815 North University St.,
Peoria IL 61604 USA.

The present invention provides hemicot nucleic acid
sequences encoding toxins from any Photorhabdus species
or strain that produces a toxin having functional
activity. Hemicot nucleic acid sequences encoding
proteins homologous to such toxins are also encompassed
by the invention.

Several terms that are used herein have a particular 
meaning and are defined as follows: 

By "functional activity" it is meant herein that the
protein toxin(s) function as insect control agents in that
the proteins are orally active, or have a toxic effect,
or are able to disrupt or deter feeding, which may or may
not cause death of the insect. When an insect comes into
contact with an effective amount of toxin delivered via
transgenic plant expression, formulated protein
composition(s), sprayable protein composition(s), a bait
matrix or other delivery system, the results are
typically death of the insect, or the insects do not feed
upon the source which makes the toxins available to the
insects.

By "homolog" it is meant an amino acid sequence that
is identified as possessing homology to a reference
Photorhabdus toxin polypeptide amino acid sequence.

By "homology" it is meant an amino acid sequence
that has a similarity index of at least 33% and/or an
identity index of at least 26% to a reference
Photorhabdus toxin polypeptide amino acid sequence, as
scored by the GAP algorithm using the BLOSUM 62 protein
scoring matrix (Wisconsin Package Version 9.0, Genetics
Computer Group (GCG), Madison, WI).

By "identity" is meant an amino acid sequence that
contains an identical residue at a given position,
following alignment with a reference Photorhabdus toxin
polypeptide amino acid sequence by the GAP algorithm.
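
By way of illustration only, an identity index of the kind defined above can be computed from two sequences that have already been aligned. The sketch below is not the GAP program itself: the function name percent_identity, the gap character, and the choice to count only positions where neither sequence has a gap are assumptions made for this example. A similarity index would additionally require the BLOSUM 62 scores and is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned positions that carry identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap (assumed convention).
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, not taken from any Photorhabdus sequence.
print(percent_identity("MKV-LT", "MRVALT"))  # 4 identical of 5 gap-free columns
```

Under the definition of "homology" given above, the resulting value would be compared with the 26% identity threshold.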

By the use of the term "Photorhabdus toxin" it is
meant any protein produced by a Photorhabdus
microorganism strain which has functional activity
against insects, where the Photorhabdus toxin could be
formulated as a sprayable composition, expressed by a
transgenic plant, formulated as a bait matrix, delivered
via baculovirus, or delivered by any other applicable
host or delivery system.

By the use of the term "toxic" or "toxicity" as used 
herein it is meant that the toxins produced by 
Photorhabdus have "functional activity" as defined 
herein. 

By "substantial sequence homology" is meant either:
a DNA fragment having a nucleotide sequence sufficiently
similar to another DNA fragment to produce a protein
having similar biochemical properties; or a polypeptide
having an amino acid sequence sufficiently similar to
another polypeptide to exhibit similar biochemical
properties.

As with other bacterial toxins, the rate of mutation
of the bacteria in a population causes many related
toxins slightly different in sequence to exist. Toxins
of interest here are those which produce protein
complexes toxic to a variety of insects upon exposure, as
described herein. Preferably, the toxins are active
against Lepidoptera, Coleoptera, Homoptera, Diptera,
Hymenoptera, Dictyoptera and Acarina. The inventions
herein are intended to capture the protein toxins
homologous to protein toxins produced by the strains
herein and any derivative strains thereof, as well as any
protein toxins produced by Photorhabdus. These
homologous proteins may differ in sequence, but do not
differ in function from those toxins described herein.
Homologous toxins are meant to include protein complexes
of between 300 kDa and 2,000 kDa that are comprised of at
least two (2) subunits, where a subunit is a peptide which
may or may not be the same as the other subunit. Various
protein subunits have been identified and are taught in
the Examples herein. Typically, the protein subunits are
between about 18 kDa to about 230 kDa; between about 160
kDa to about 230 kDa; 100 kDa to 160 kDa; about 80 kDa to
about 100 kDa; and about 50 kDa to about 80 kDa.

WO 01/11029                                PCT/US00/22237

As discussed above, some Photorhabdus strains can be
isolated from nematodes. Some nematodes, elongated
cylindrical parasitic worms of the phylum Nematoda, have
evolved an ability to exploit insect larvae as a favored
growth environment. The insect larvae provide a source
of food for growing nematodes and an environment in which
to reproduce. One dramatic effect that follows invasion
of larvae by certain nematodes is larval death. Larval
death results from the presence of, in certain nematodes,
bacteria that produce an insecticidal toxin which arrests
larval growth and inhibits feeding activity.
Interestingly, it appears that each genus of insect
parasitic nematode hosts a particular species of
bacterium, uniquely adapted for symbiotic growth with
that nematode. In the interim since this research was
initiated, the bacterial genus Xenorhabdus
was reclassified into the genera Xenorhabdus and
Photorhabdus. Bacteria of the genus Photorhabdus are
characterized as being symbionts of Heterorhabditis
nematodes while Xenorhabdus species are symbionts of the
Steinernema species. This change in nomenclature is
reflected in this specification, but in no way should a
change in nomenclature alter the scope of the inventions
described herein.

The peptides and genes that are disclosed herein are
named according to the guidelines recently published in
the Journal of Bacteriology ("Instructions to Authors",
pp. i-xii, Jan. 1996), which is incorporated herein by
reference.

Transformation methods useful in carrying out the
invention are well known, and are described, for example,
in WO98/08932.

Hemicot tcdA and tcbA 

SEQ ID NO: 3 is the nucleotide sequence for an
engineered tcdA gene in accordance with the invention.
SEQ ID NO: 4 is the nucleotide sequence for an engineered
tcbA gene in accordance with the invention.

The following Tables 1 and 2 identify significant
features of the engineered tcdA and tcbA genes.



Table 1
tcdA

Feature                      nucleotides of SEQ ID NO: 3

NcoI                         1-6
HindIII                      48-53
KpnI                         246-254
sequence encoding TcbAii     267-5798
NheI                         333-338
BglII                        1215-1220
ClaI                         2604-2609
PstI                         4015-4020
AgeI                         5088-5093
MunI                         5598-5603
XbaI                         5778-5783
sequence encoding TcbAiii    5799-7517
AflII                        5853-5858
SphI                         6439-6444
SfuI                         7392-7397
SacI                         7519-7524
XhoI                         7522-7527
StuI                         7528-7533
NotI                         7533-7538



Table 2
tcbA

Feature                      nucleotides of SEQ ID NO: 4

NcoI                         1-6
HindIII                      48-53
KpnI                         246-251
sequence encoding TcbAii     267-5798
NheI                         333-338
BglII                        1215-1220
ClaI                         2604-2609
PstI                         4015-4020
AgeI                         5088-5093
MunI                         5598-5603
XbaI                         5778-5783
sequence encoding TcbAiii    5799-7517
AflII                        5853-5858
SfuI                         7392-7397
SacI                         7519-7524
XhoI                         7522-7527
StuI                         7528-7533
NotI                         7535-7540



It should be noted that the proteins encoded by the
plant-optimized tcdA (SEQ ID NO: 3) and tcbA (SEQ ID
NO: 4) differ from the native proteins by the addition of
an Ala residue at position #2. This modification was
made to accommodate the NcoI site which spans the ATG
start codon.
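
The reason an Ala codon accommodates the site can be seen from the NcoI recognition sequence, CCATGG: the base immediately following the ATG must be G, and GCT (Ala) supplies it. The following is a minimal sketch under stated assumptions; the function names and the toy coding sequence are hypothetical and are not taken from SEQ ID NO: 3 or SEQ ID NO: 4.

```python
NCOI_SITE = "CCATGG"  # NcoI recognition sequence


def add_ala_at_position_2(cds: str) -> str:
    """Insert a GCT (Ala) codon directly after the ATG start codon."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    return "ATG" + "GCT" + cds[3:]


def spans_ncoi(upstream: str, cds: str) -> bool:
    """True if CCATGG is formed across the junction of upstream DNA and the start codon."""
    return upstream.upper()[-2:] + cds.upper()[:4] == NCOI_SITE


toy_cds = "ATGAACGAG"  # hypothetical Met-Asn-Glu, for illustration only
modified = add_ala_at_position_2(toy_cds)
print(spans_ncoi("CC", toy_cds))   # False: the ATG is followed by A
print(spans_ncoi("CC", modified))  # True: the ATG is followed by the G of GCT
```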

The following Table 3 compares the codon composition of
the engineered tcdA gene of SEQ ID NO: 3 and the engineered
tcbA gene of SEQ ID NO: 4 with the codon compositions of
the native genes, typical dicot genes, and maize
genes.



Table 3

amino   codon   % in SEQ   % in    % in SEQ   % in    % in    % in
acid            ID NO:3    tcdA    ID NO:4    tcbA    dicot   maize

Ala     GCT        62       21        69       41      42      24
        GCC        26       32        27       17      27      34
        GCA        11       25         4       22      25      18
        GCG         0       21         0       21       6      24
Arg     AGG        48        0        60        2      25      26
        CGC        22       36        18       16      11      24
        AGA        20       11        15        6      30      15
        CGT        11       39         7       57      21      11
        CGG         0        7         0       13       4      15
        CGA         0        8         0        6       8       9
Asn     AAC       100       32       100       33      55      68
        AAT         0       68         0       67      45      32
Asp     GAC        67       22        70       25      42      63
        GAT        33       78        30       75      58      37
Cys     TGC       100       30       100       19      56      68
        TGT         0       70         0       81      44      32
End     TGA       100        0       100        0      33      59
        TAG         0        0         0        0      19      21
        TAA         0      100         0      100      48      20
Gln     CAA        65       61        74       53      59      38
        CAG        35       39        26       47      41      62
Glu     GAG       100       24        98       36      51      71
        GAA         0       76         2       64      49      29
Gly     GGT        67       37        64       44      33      20
        GGC        32       36        36       22      16      42
        GGA         1       20         0       19      38      19
        GGG         0        8         0       16      12      20
His     CAC        62       40        72       31      46      62
        CAT        38       60        28       69      54      38
Ile     ATC        73       34        65       24      37      58
        ATT        27       51        35       59      45      28
        ATA         0       15         0       17      18      14
Leu     CTC        54       11        59        7      28      26
        TTG        29       17        25       32      26      15
        CTT        16        9        15        7      19      17
        TTA         0       18         0       19      10       5
        CTG         0       32         0       29       9      29
        CTA         0       13         0        7       8       8
Lys     AAG        99       79        99       75      61      78
        AAA         1       21         1       25      39      22
Met     ATG       100      100       100      100     100     100
Phe     TTC       100       42       100       41      55      71
        TTT         0       58         0       59      45      29
Pro     CCA        74       30        91       26      42      26
        CCT        22       28         7       20      32      22
        CCC         4       14         3        7      17      24
        CCG         0       27         0       47       9      28
Ser     TCC        47       19        55       11      18      23
        TCT        35       15        30       15      25      15
        AGC        18       22        15       18      18      23
        AGT         0       20         0       31      14       9
        TCG         0        7         0        8       6      14
        TCA         0       17         0       17      19      16
Thr     ACC        60       41        64       31      30      37
        ACT        28       25        32       34      35      20
        ACA        12       21         4       18      27      21
        ACG         0       13         0       18       8      22
Trp     TGG       100      100       100      100     100     100
Tyr     TAC       100       24       100       19      57      73
        TAT         0       76         0       81      43      27
Val     GTC        69       27        73       11      20      31
        GTG        21       17        22       27      29      39
        GTT        10       34         3       48      39      21
        GTA         0       22         2       14      12       8


EXAMPLE 1 

Design Of Plant Codon-Biased Genes Encoding W-14 Peptides
TcbA and TcdA

A. Gene Design 

The coding strands of the native DNA sequences of the
Photorhabdus W-14 genes encoding peptides TcbA and TcdA
were scanned for the presence of deleterious sequences
such as the Shaw/Kamen RNA destabilizing motif ATTTA,
intron splice recognition sites, and poly A addition
motifs. This was done using the MacVector Sequence
Analysis Software (Oxford Molecular Biology Group,
Symantec Corp.), using a custom Nucleic Acid Subsequence
File. The native sequence was also searched for runs of
4 or more of the same base.

Motif searching of the native W-14 tcbA and tcdA
genes revealed the presence of many potentially
deleterious sequences in the protein coding strands, as
summarized in Table 4. Not shown, but also present, were
many runs of four or more single residues (e.g., the
native tcbA gene has 81 runs of four A's).



Table 4

Native   ATTTA   5' Splice   3' Splice   Poly A      RNAP II
Gene                                     Addition*   term.

tcbA      18        7           17          46          0
tcdA      18        7           13          77          1
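
A scan of the kind summarized in Table 4 can be sketched in a few lines. The code below is an illustration only and is not the MacVector procedure used by the inventors: it covers just the ATTTA motif and single-base runs, because the contents of the custom Nucleic Acid Subsequence File (splice and poly A motifs) are not given in the text. The function names and the toy sequence are hypothetical.

```python
import re


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of a motif, including overlapping occurrences."""
    return len(re.findall("(?=" + re.escape(motif) + ")", seq.upper()))


def homopolymer_runs(seq: str, min_len: int = 4):
    """Return (base, 0-based start, length) for each run of a single base of at least min_len."""
    pattern = r"([ACGT])\1{%d,}" % (min_len - 1)
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in re.finditer(pattern, seq.upper())]


toy = "GCATTTAGGAAAAACT"  # hypothetical sequence, for illustration only
print(count_motif(toy, "ATTTA"))  # one ATTTA motif
print(homopolymer_runs(toy))      # one run of five A's starting at index 9
```

Whether a run of five identical bases should be counted once or as two overlapping runs of four is a convention the text does not specify; the sketch counts each maximal run once.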



Analyses of eukaryotic genes and plant genes in 
particular have shown that CG & TA doublets are 
underrepresented, while the genes are enriched in CT & TG 
doublets. The sequences of the hemicot biased genes have 
accordingly been adjusted to encompass these base 
compositions and to have G+C compositions of about 53%, 
similar to many plant genes. When compared to the native 
W-14 tcbA and tcdA genes, the plant-biased genes have a 
much more uniform G+C distribution. 
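
The two base-composition measures referred to above, G+C content and doublet frequency, can be tabulated as follows. This is a minimal sketch; the function names and the toy sequence are assumptions made for illustration, and doublets are counted at every position rather than only at codon junctions.

```python
from collections import Counter


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def doublet_counts(seq: str) -> Counter:
    """Count every overlapping two-base doublet in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


toy = "ATGGCTAACGAGCTG"  # hypothetical sequence, for illustration only
counts = doublet_counts(toy)
print(round(gc_percent(toy), 1))                           # 8 of 15 bases are G or C
print(counts["CG"], counts["TA"], counts["CT"], counts["TG"])
```

A design meeting the criteria described above would show low CG and TA counts relative to CT and TG, with an overall G+C value near 53%.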

Nucleotide changes to remove potentially deleterious 
sequences were chosen to simultaneously adjust the codon 
composition of the coding region to more closely reflect 
that of plant genes. A framework for these changes was 
provided by the codon bias tables prepared for maize and 
dicot genes shown in Table 3. 
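
Percentages of the kind listed in Table 3 express, for each amino acid, how its occurrences are divided among its synonymous codons. The following sketch shows one way such a table could be computed from a coding sequence; it uses the standard genetic code, and the function name and the toy sequence are hypothetical rather than taken from the disclosed sequences.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built with the bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_percentages(cds: str) -> dict:
    """For each codon present, its rounded percentage among the codons for the same amino acid."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: round(100.0 * n / totals[CODON_TABLE[codon]])
            for codon, n in counts.items()}


toy = "ATGGCTGCTGCCAAGTGA"  # hypothetical: Met Ala Ala Ala Lys End
print(codon_percentages(toy))
```

In this toy sequence the three Ala codons divide as GCT 67% and GCC 33%, which is the form in which the Ala rows of Table 3 are expressed.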

Comparison of codon compositions of the native W-14
genes to maize and dicot genes revealed that the W-14
genes contain a very different preference set of the
degenerate codons for the 18 amino acids for which there
is a choice (Table 3). For each of 8 amino acids (Phe,
Tyr, Cys, Arg, Asn, Lys, Glu, and Gly) in both W-14
genes, the most abundant codon is different from the
preferred codons found in either maize or dicot genes.
One might expect that translational difficulties would be
encountered in efforts to produce in plants proteins
(such as TcbA and TcdA) having high relative amounts of
these amino acids from mRNAs having large numbers of
nonpreferred codons. There is a marked difference in
distribution of the codon compositions specifying the
other 10 amino acids. For His, Gln, Ile, Val, and Asp,
the dicot-preferred codons are found as the most abundant
ones in both W-14 genes. For Leu, Thr, Ser, and Ala, the
maize-preferred codons are the most abundant codon
choices found in the tcdA gene. In contrast, the tcbA
gene contains only the CCG (Pro) maize-preferred codon as
the highest abundance choice.

In making the codon choices, doublet contents were
considered, so that adjacent codons preferably did not
form CG or TA doublets (which are underrepresented in
eukaryotic genes; 1, 4), while CT or TG doublets (which
are enriched in eukaryotic genes; ibid.) were created when
possible.

Choices were also made to utilize a diversity of
codons for Met, Trp, Asn, Asp, Cys, Glu, His, Ile, Lys,
Phe, Thr, and Tyr.

The sequences were also designed to encode unique
6-bp recognition sites for restriction enzymes, spaced
about every 1200 bp. Finally, an additional codon (GCT;
Ala) was inserted at the second position to encode an
NcoI recognition site encompassing the ATG (Met) start
codon. Additional recognition sites were included after
the stop codon to facilitate subsequent cloning steps
into expression vectors. These features are set forth
above in Tables 1 and 2.
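
Whether a recognition site is unique within a designed sequence can be verified by locating every occurrence of the site. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and only a few of the enzymes named in Tables 1 and 2 are included. Because each listed site is palindromic, scanning one strand is sufficient.

```python
# Recognition sequences of some enzymes named in Tables 1 and 2.
SITES = {
    "NcoI": "CCATGG",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "SacI": "GAGCTC",
    "XhoI": "CTCGAG",
}


def site_positions(seq: str, site: str) -> list:
    """1-based start positions of every occurrence of site (overlaps allowed)."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + 1)
        i = seq.find(site, i + 1)
    return positions


def unique_sites(seq: str, sites: dict = SITES) -> dict:
    """Map each enzyme that cuts exactly once to the 1-based start of its site."""
    hits = {name: site_positions(seq, site) for name, site in sites.items()}
    return {name: pos[0] for name, pos in hits.items() if len(pos) == 1}


toy = "CCATGGTTAAGCTTGGGAGCTCGAGAAGCTT"  # hypothetical sequence
print(unique_sites(toy))  # HindIII occurs twice and KpnI not at all, so both are omitted
```

In the toy sequence the SacI and XhoI sites overlap by three bases (GAGCTCGAG), the same arrangement implied by the SacI and XhoI coordinates listed in the tables.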

The new tcdA and tcbA genes of SEQ ID NO: 3 and SEQ
ID NO: 4 share 73.5% and 72.6% identity, respectively, with
their native W-14 counterparts (Wisconsin Genetics
Computer Group, GAP algorithm).

B. Gene Synthesis 

The complete synthesis of the plant codon-biased
tcbA and tcdA genes was performed under contract by
Operon Technologies, Inc. (OPTI, Alameda, CA).
Basically, chemically synthesized oligonucleotides of
appropriate sequence were assembled into DNA pieces about
500 bases long. These were joined together end-to-end
(presumably by means of appropriately placed restriction
enzyme sites) into four larger pieces of roughly 2
kilobase pairs (kbp) each; therefore each comprised about
1/4 of the entire coding region of the particular gene.
The DNA sequence of the pieces was confirmed at this step.
If mistakes in sequence were present, the appropriate
oligonucleotides were re-synthesized, and the assembly
process was repeated. Once the gene fractional parts were
sequence verified, they were assembled in pairs to make
the gene halves, and again sequence verified. Finally,
the two halves were joined, and the sequences of the
junctions between the halves were verified. Therefore,
each part of the new gene was sequence verified at least
twice.

It should be noted that attempts to express the
native tcbA or tcdA genes in standard Escherichia coli
cloning strains suggest that production of these
proteins is lethal. Lethality problems may be
encountered if standard cloning vectors having leaky
expression from inherent lacZ promoters are used to
assemble these genes.

C. Addition Of Endoplasmic Reticulum Targeting Peptide To
TcdA Coding Region

It is known to those in the field of plant gene
expression that proteins are specifically directed into
the endoplasmic reticulum (ER) by means of a short signal
peptide which is removed during or after the transport
process through the ER membrane. The mature (processed)
protein is incorporated into the ER endomembrane or is
released into the ER lumen where the transported protein
may be uniquely folded (aided by chaperonins), modified
by glycosylation, accumulated in the vacuole, or
additionally translocated (by secretion). These
processes are reviewed by Gomord and Faye [V. Gomord and
L. Faye, (1996) Signals and mechanisms involved in
intracellular transport of secreted proteins in plants.
Plant Physiol. Biochem. 34:165-181] and by Bar-Peled et
al. [M. Bar-Peled, D. C. Bassham, and N. V. Raikhel,
(1996) Transport of proteins in eukaryotic cells: more
questions ahead. Plant Molec. Biology 32:223-249]. It is
also known that the subcellular recognition mechanisms
for an ER signal peptide are evolutionarily somewhat
conserved, since the ER signal for a protein normally
produced in monocot (maize) cells is recognized and
processed normally by dicot (tobacco) cells. This is
exemplified by the maize 15 kDa zein ER signal peptide
[L. M. Hoffman, D. D. Donaldson, R. Bookland, K. Rashka,
and E. M. Herman, (1987) Synthesis and protein body
deposition of maize 15-kd zein in transgenic tobacco
seeds. EMBO J. 6:3213-3221, and U.S. Patent 5589616].
Further, it is known that the ER signal peptide derived
from one protein can direct the translocation of a
different protein if it is appropriately attached to the
second protein by genetic engineering methods [D. C. Hunt
and M. J. Chrispeels, (1991) The signal peptide of a
vacuolar protein is necessary and sufficient for the
efficient secretion of a cytosolic protein. Plant
Physiol. 96:18-25, and Denecke, J., J. Botterman, and R.
Deblaere (1990) Protein secretion in plants can occur via
a default pathway. Plant Cell 2:51-59]. Therefore, one
may expose a protein in vivo to different biochemical
environments by directing its accumulation in the cytosol
(by not providing a signal peptide sequence), or in the
ER/vacuole (by provision of an appropriate signal
peptide).

The ER signal peptide of maize 15 kDa zein proteins
is known to comprise the first 20 amino acids encoded by
the zein coding region. Two examples of such signal
peptides are the ER signal peptide of 15 kDa zein from A5707
maize, NCBI Accession # M72708, and the ER signal peptide
of 15 kDa zein from Black Mexican Sweet maize, NCBI
Accession # M13507. There is only a single amino acid
difference (Ser vs Cys at residue 17) between these
signal peptides.

SEQ ID NO: 5 is a modified sequence coding for the ER
signal peptide of 15 kDa zein from Black Mexican Sweet
maize. The modifications embodied in this sequence were
made to accommodate the different monocot/dicot codon
usages and other sequence motif considerations discussed
above in the design of the plant-optimized tcdA coding
region. The sequence includes an additional Ala residue
at position #2 to accommodate the NcoI site which spans
the ATG start codon.

SEQ ID NO: 6 gives a sequence coding for the
full-length native TcdA protein (amino acids 22-2537) fused to
the modified 15 kDa zein endoplasmic reticulum signal
peptide (amino acids 1-21).

Example 2 

Transformation Of Tobacco With Agrobacterium Carrying 
Plasmid pDAB2041 Encoding Photorhabdus Toxins 
A. Plasmid pDAB2041 

Preparation of tobacco transformation vectors was
accomplished in three steps. First, a modified
plant-optimized tcdA coding region was ligated into a tobacco
plant expression cassette plasmid. In this step, the
coding region was placed under the transcriptional
control of a promoter functional in tobacco plant cells.
RNA transcription termination and polyadenylation were
mediated by a downstream copy of the terminator region
from the Agrobacterium nopaline synthase gene. Two
plasmids designed to function in this role are pDAB1507
and pDAB2006. In the second step, the complete gene
comprised of the promoter, coding region, and terminator
region was ligated between the T-DNA borders of an
Agrobacterium binary vector, pDAB1542. Also positioned
between the T-DNA borders was a plant selectable marker
gene to allow selection of transformed tobacco plant
cells. In the third step, the engineered binary vector
plasmid was conjugated from its E. coli host strain into
a disabled Agrobacterium tumefaciens strain capable of
transforming tobacco plant cells that regenerate into
fertile transgenic plants.

It is a feature of plasmid pDAB1507 that any coding
region having an NcoI site at its 5' end and a SacI site
3' to the coding region, when cloned into the unique NcoI
and SacI sites of pDAB1507, is placed under the
transcriptional control of an enhanced version of the
CaMV 35S promoter. It is also a feature of pDAB1507 that
the 5' untranslated leader (UTR) sequence preceding the
NcoI site comprises a modified version of the 5' UTR of
the MSV coat protein gene, into which has been cloned an
internally deleted version of the maize Adh1S intron 1.
Additionally, it is a feature of pDAB1507 that
transcription termination and polyadenylation of the mRNA
containing the introduced coding region are mediated by
termination/Poly A addition sequences derived from the
nopaline synthase (Nos) gene. Finally, it is a feature
of pDAB1507 that the entire assembly of promoter/coding
region/3' UTR can be obtained as a single DNA fragment by
cleavage at the flanking NotI sites.

It is a feature of plasmid pDAB2006 that any coding
region having an NcoI site at its 5' end and a SacI site
3' to the coding region, when cloned into the unique NcoI
and SacI sites of pDAB2006, is placed under the
transcriptional control of the CaMV 35S promoter. It is
also a feature of pDAB2006 that the 5' untranslated
leader (UTR) sequence preceding the NcoI site comprises a
polylinker. Additionally, it is a feature of pDAB2006
that transcription termination and polyadenylation of the
mRNA containing the introduced coding region are mediated
by termination/Poly A addition sequences derived from the
nopaline synthase (Nos) gene. Finally, it is a feature
of pDAB2006 that the entire assembly of promoter/coding
region/3' UTR can be obtained as a single DNA fragment by
cleavage at the flanking NotI sites.

It is a feature of pDAB1542 that any DNA fragment
flanked by NotI sites can be cloned into the unique NotI
site of pDAB1542, thus placing the introduced fragment
between the T-DNA borders, and adjacent to the neomycin
phosphotransferase II (kanamycin resistance) gene.

To prepare a plant-expressible gene to produce the
non-targeted TcdA protein in tobacco plant cells, DNA of
a plasmid (pA0H_4-OPTI) containing the plant-optimized
tcdA coding region (SEQ ID NO: 3) was cleaved with
restriction enzymes NcoI and SacI, and the large 7550 bp
fragment was ligated to similarly-cut DNA of plasmid
pDAB1507 to produce plasmid pDAB2040. DNA of pDAB2040
was then digested with NotI, and the 8884 bp fragment was
ligated to NotI-digested DNA of pDAB1542 to produce
plasmid pDAB2041. This plasmid was then conjugated by
triparental mating [Firoozabady, E., D. L. DeBoer, D. J.
Merlo, E. L. Halk, L. N. Amerson, K. E. Rashka, and E. E.
Murray (1987) Transformation of cotton (Gossypium
hirsutum L.) by Agrobacterium tumefaciens and
regeneration of transgenic plants. Plant Molec. Biol.
10:105-116] from the host Escherichia coli strain (XL1-Blue,
Stratagene, La Jolla, CA) into the nontumorigenic
Agrobacterium tumefaciens strain EHA101S, which is a
spontaneous streptomycin-resistant mutant of strain
EHA101 (Hood, E. E., G. L. Helmer, R. T. Fraley, and M.-D.
Chilton (1986) The hypervirulence of Agrobacterium
tumefaciens A281 is encoded in a region of pTiBo542
outside of T-DNA. J. Bacteriol. 168:1291-1301). Strain
EHA101S (pDAB2041) was then used to produce transgenic
tobacco plants that expressed the TcdA protein.

B. Plasmid pRK2013

To prepare a plant-expressible gene to produce the
endoplasmic reticulum-targeted TcdA protein in tobacco
plant cells, DNA of a plasmid (pA0H_4-ER) containing the
plant-optimized, ER-targeted tcdA coding region (SEQ ID
NO: 6) was cleaved with restriction enzymes NcoI and SacI,
and the large 7610 bp fragment was ligated to similarly-cut
DNA of plasmid pDAB2006 to produce plasmid pDAB1833.
DNA of pDAB1833 was then digested with NotI, and the 8822
bp fragment was ligated to NotI-digested DNA of pDAB1542
to produce plasmid pDAB2052. This plasmid was then
conjugated by triparental mating from the host
Escherichia coli strain (XL1-Blue) into the
nontumorigenic Agrobacterium tumefaciens strain EHA101S.
Strain EHA101S (pDAB2052) was then used to produce
transgenic tobacco plants that expressed the TcdA protein
containing an amino terminus endoplasmic reticulum
targeting peptide.
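
The fragment sizes reported for the two tobacco cassettes allow a rough estimate of how much non-coding sequence each expression plasmid contributes. The sketch below assumes, for illustration only, that each NcoI/SacI coding fragment lies wholly within the corresponding NotI fragment, so that the difference approximates the combined promoter, leader, and terminator length. The dictionary and function names are hypothetical.

```python
# (NcoI/SacI coding fragment in bp, NotI cassette fragment in bp), as reported in the text.
CONSTRUCTS = {
    "pDAB2040": (7550, 8884),   # non-targeted tcdA cloned into pDAB1507
    "pDAB1833": (7610, 8822),   # ER-targeted tcdA cloned into pDAB2006
}


def regulatory_bp(coding_fragment_bp, cassette_bp):
    """Cassette length not accounted for by the coding fragment (assumed additive)."""
    return cassette_bp - coding_fragment_bp


overhead = {name: regulatory_bp(*sizes) for name, sizes in CONSTRUCTS.items()}
for name, bp in overhead.items():
    print(f"{name}: about {bp} bp of flanking regulatory sequence")
```

Under that assumption the pDAB1507-derived cassette carries somewhat more flanking sequence than the pDAB2006-derived one, which would be in keeping with its leader containing the MSV 5' UTR and Adh1S intron rather than a polylinker.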

C. Transfer of Plasmid pDAB2041 Into Agrobacterium Strain
EHA101S

Cultures of E. coli carrying the engineered Ti
plasmid pDAB2041 (plasmid containing the rebuilt Toxin A
gene, tcdA), E. coli carrying the plasmid pRK2013, and
Agrobacterium strain EHA101S were grown overnight, then
mixed 1:1:1 on plain LB medium solidified with agar and
cultured in the dark at 28°C. Two days later, the lawn of
bacteria was scraped up with a loop, suspended in plain
LB medium, vortexed, and then diluted 1:10⁴, 1:10⁵, and
1:10⁶ in plain LB liquid medium. Aliquots of these
dilutions were spread on selective plates containing
medium YEP plus erythromycin (100 mg/L) and streptomycin
(250 mg/L) and grown at 28°C. Two days later, single
colonies were picked and streaked onto the same medium,
then spread to give single colonies. Single colonies were
picked again and streaked, then spread for single
colonies. Single colonies were picked a third time, grown
as streaks, then subjected to a quality analysis
involving growth on lactose medium and chromogenic assay
with Benedict's reagent. Of ten strains developed in this
way, the fastest coloring colony was chosen for further
work.
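
The reason for plating three successive tenfold dilutions can be illustrated numerically. In the sketch below the starting density of selectable cells and the plated volume are purely hypothetical values chosen for illustration (neither is given in the text); only the three dilution factors come from the procedure above.

```python
# Dilution factors used in the procedure above.
DILUTION_FACTORS = (1e4, 1e5, 1e6)

# Hypothetical values, not taken from the text.
ASSUMED_CELLS_PER_ML = 1e8     # selectable cells in the resuspended lawn
ASSUMED_PLATED_ML = 0.1        # volume spread on each selective plate


def expected_colonies(cells_per_ml, dilution_factor, plated_ml):
    """Expected colony count on one plate for a given dilution."""
    return cells_per_ml / dilution_factor * plated_ml


counts = {
    factor: expected_colonies(ASSUMED_CELLS_PER_ML, factor, ASSUMED_PLATED_ML)
    for factor in DILUTION_FACTORS
}
for factor, colonies in counts.items():
    print(f"1:{factor:.0e} dilution -> roughly {colonies:.0f} colonies per plate")
```

With these assumed inputs the three plates span two orders of magnitude in colony number, so that at least one plate is likely to carry well-separated single colonies even when the true density is not known in advance.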






D. Transformation Of Tobacco With Agrobacterium Carrying
Plasmid pDAB2041

Tobacco transformation with Agrobacterium
tumefaciens was carried out by a method similar, but not
identical, to published methods (R. Horsch et al., 1988.
Plant Molecular Biology Manual, S. Gelvin et al., eds.,
Kluwer Academic Publishers, Boston). To provide source
tissue for the transformation, tobacco seeds (Nicotiana
tabacum cv. Kentucky 160) were surface sterilized and
planted on the surface of TOB-, which is a hormone-free
Murashige and Skoog medium (T. Murashige and F. Skoog,
1962. A revised medium for rapid growth and bioassays
with tobacco tissue culture. Physiol. Plant. 15: 473-497)
solidified with agar. Plants were grown for 6-8 weeks in
a lighted incubator room at 28-30°C and leaves were
collected sterilely for use in the transformation
protocol. Pieces of approximately one cm² were sterilely cut
from these leaves, excluding the midrib. Cultures of the
Agrobacterium strains (EHA101S containing pDAB2041),
which had been grown overnight on a rotor at 28°C, were
pelleted in a centrifuge and resuspended in sterile
Murashige & Skoog salts, adjusted to a final optical
density of 0.7 at 600 nm. Leaf pieces were dipped in
this bacterial suspension for approximately 30 seconds,
then blotted dry on sterile paper towels and placed right
side up on medium TOB+ (Murashige and Skoog medium
containing 1 mg/L indole acetic acid and 2.5 mg/L
benzyladenine) and incubated in the dark at 28°C. Two
days later the leaf pieces were moved to medium TOB+
containing 250 mg/L cefotaxime (Agri-Bio, North Miami,
Florida) and 100 mg/L kanamycin sulfate (AgriBio) and
incubated at 28-30°C in the light. Leaf pieces were moved
to fresh TOB+ with cefotaxime and kanamycin twice per
week for the first two weeks and once per week
thereafter. Leaf pieces which showed regrowth of the
Agrobacterium strain were moved to medium TOB+ with
cefotaxime and kanamycin, plus 100 mg/L carbenicillin
(Sigma). Four to six weeks after the leaf pieces were
treated with the bacteria, small plants arising from
transformed foci were removed from this tissue
preparation and planted into medium TOB- containing 250
mg/L cefotaxime and 100 mg/L kanamycin in Magenta GA7
boxes (Magenta Corp., Chicago). These plantlets were
grown in a lighted incubator room. After 3-4 weeks the
primary transgenic plants had rooted and grown to a size
sufficient that leaf samples could be analyzed for
expression of protein from the transgene. Twenty-five
independent transgenic events were recovered as single
plants from the pDAB2041 transformation.

Eight independent lines expressing various levels of
transgenic protein from the T-DNA of pDAB2041 were
propagated in vitro from leaf pieces as follows. Twelve
to sixteen pieces of approximately one cm² were sterilely cut
from leaves of each primary transgenic plant, excluding
the midrib and all naturally occurring edges. These leaf
pieces were placed on medium TOB+ containing 250 mg/L
cefotaxime and 100 mg/L kanamycin, and cultured in the
lighted incubator at 28-30°C for 3-4 weeks, at which time
small plants could be cut from the proliferating tissue
mass. Several small plantlets from each transgenic line
were moved into Magenta boxes containing medium TOB- plus
cefotaxime and kanamycin and allowed to root and grow.
The proliferating tissue mass was further cultured on
medium TOB+ with cefotaxime and kanamycin, and additional
plants could be cut out and grown up as needed.

Plants were moved into the greenhouse by washing the
agar from the roots, transplanting into soil in 5½ inch
square pots, placing the pot into a Ziploc bag
(DowBrands), placing plain water into the bottom of the
bag, and placing in indirect light in a 30°C greenhouse
for one week. After one week the bag could be opened; the
plants were fertilized and allowed to grow further, until
the plants were acclimated and the bag was removed.
Plants were grown under ordinary warm greenhouse
conditions (30°C, 16 h light). Plants were suitable for
sampling four weeks post transplant.



Example 3 

Characterization Of Transgenic Tobacco Plants Expressing
Photorhabdus Toxin That Confer Insect Control

A. Polyclonal Antibody Production 

The E. coli-produced recombinant TcdA protein was
purified by a series of column purifications. The protein
was sent to Berkeley Antibody Company (Richmond, CA) for
the production of antiserum in a rabbit. Inoculations
with the antigen were initiated with 0.5 mg of protein
followed by four boosting injections of 0.25 mg each at
about three-week intervals. The rabbit serum was tested
by standard Western analysis using the recombinant
TcdA protein as the antigen and the enhanced
chemiluminescence (ECL) method (Amersham, Arlington Heights,
IL). The antibodies (PAb-EA0) were purified using a PURE I
antibody purification kit (Sigma, St. Louis, MO). PAb-EA0
antibodies recognize the full-length TcdA and its
processed components.

B. Expression Of TcdA Protein In Tobacco 

Protein was extracted from the leaf tissue of 
transformed and non-transformed tobacco plants following 
the procedure described immediately below. 

Two leaf disks of 1.4 cm in diameter were harvested
from the middle portion of a fully expanded leaf. The
disks were placed on a 1.6 x 4 cm piece of 3M Whatman
paper. The paper was folded lengthwise and inserted in a
flexible straw. Four hundred microliters of the
extraction buffer (9.5 ml of 0.2 M NaH2PO4, 15.5 ml of 0.2
M Na2HPO4, 2 ml of 0.5 M Na2EDTA, 100 µl of Triton X-100, 1
ml of 10% Sarkosyl, 78 µl of beta-mercaptoethanol, H2O to
bring total volume to 100 ml) was pipetted onto the
paper. The straw containing the sample was then passed
through a rolling device used for squeezing out the
extract. A 1.5 mL microcentrifuge tube was placed at the
other end of the straw to collect the extract. The
extract was centrifuged for 10 minutes at 14,000 rpm in
an Eppendorf refrigerated microcentrifuge. The
supernatant was transferred into a new tube. Protein
quantitation analysis was performed using the standard
Bio-Rad Protein Analysis protocol (Bio-Rad Laboratories,
Hercules, CA). The extract was diluted to 2 mg/ml of
total protein using the extraction buffer.
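
As a worked illustration of the buffer recipe and the final dilution step, the sketch below computes the final concentrations of the components whose stock strengths are stated in molar or percent terms, and the volumes needed to bring an extract to 2 mg/ml. The measured extract concentration (5 mg/ml) and the final volume (100 µl) are hypothetical inputs chosen for the example, and the function names are not from the source.

```python
TOTAL_BUFFER_ML = 100.0   # final buffer volume stated in the recipe


def final_concentration(stock_conc, stock_ml, total_ml=TOTAL_BUFFER_ML):
    """C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_ml / total_ml


nah2po4_molar = final_concentration(0.2, 9.5)    # monobasic phosphate
na2hpo4_molar = final_concentration(0.2, 15.5)   # dibasic phosphate
edta_molar = final_concentration(0.5, 2.0)       # Na2EDTA
sarkosyl_pct = final_concentration(10.0, 1.0)    # percent


def dilution_volumes(measured_mg_ml, target_mg_ml, final_ul):
    """Return (extract volume, buffer volume) in µl to reach the target concentration."""
    extract_ul = final_ul * target_mg_ml / measured_mg_ml
    return extract_ul, final_ul - extract_ul


# Hypothetical extract measured at 5 mg/ml, diluted to 2 mg/ml in 100 µl.
extract_ul, buffer_ul = dilution_volumes(5.0, 2.0, 100.0)

print(f"phosphate: {1000 * (nah2po4_molar + na2hpo4_molar):.0f} mM total")
print(f"EDTA: {1000 * edta_molar:.0f} mM, Sarkosyl: {sarkosyl_pct:.1f}%")
print(f"mix {extract_ul:.0f} µl extract with {buffer_ul:.0f} µl buffer")
```

On these figures the recipe corresponds to a 50 mM sodium phosphate buffer containing 10 mM EDTA and 0.1% Sarkosyl.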

For the detection of transgenic protein, Western
blot analysis was performed. Following a standard
procedure for protein separation (Laemmli, 1970), 40 µg
of protein was loaded in each well of a 4-20% gradient
polyacrylamide gel (Owl Scientific Co., MA) for
electrophoresis. Subsequently, the protein was
transferred onto a nitrocellulose membrane using a semi-dry
electroblotter (Pharmacia LKB Biotechnology,
Piscataway, NJ). The membrane was incubated for one hour
in Blotto (5% milk in TBST solution; 25 mM Tris HCl pH
7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20). Thereafter,
Blotto was replaced by the primary antibody solution
(in Blotto). After one hour in the primary antibody, the
membrane was washed with TBST for five minutes three
times. Then the secondary antibody in Blotto (1:2000
dilution of goat anti-rabbit IgG conjugated to
horseradish peroxidase; Bio-Rad Laboratories) was added
to the membrane. After one hour of incubation, the
membrane was washed with an excess amount of TBST for 10
minutes four times. The protein was visualized by using
the enhanced chemiluminescence (ECL) method (Amersham,
Arlington Heights, IL). The differential intensity of
the protein bands was measured using a densitometer
(Molecular Dynamics Inc., Sunnyvale, CA).

To determine the expression of TcdA protein in
tobacco transformed with pDAB2041, PAb-EA0 antibodies were
used as the primary antibodies. The expression levels of
TcdA protein varied among independent transformation
events. The primary plant generated from the event
#2041-13 showed the highest level of pre-pro TcdA
expression of extractable protein. When the leaf pieces
from this plant (#2041-13) were used in in vitro
propagation, several plants were obtained. Seven of
these plants were analyzed for the expression of the TcdA
protein. All but one plant produced the full-length TcdA
protein as well as some processed peptide components.
Using antibodies specific to neomycin
phosphotransferase, NPT (5 Prime-3 Prime, Boulder, CO), the
expression of the selectable marker gene (npt II) was
detected. Similar results were obtained for #2041-29.


Table 5
Western analysis of plants derived from event #204

Plant #      TcdA    NPT (selectable marker)
2041-13A     +       not done
2041-13B     +       not done
2041-13-1    -       +
2041-13-2    +       +
2041-13-3    +       +
2041-13-4    +       +
2041-13-5    +       +

C. Nucleic Acid Analysis of Transgenic Tobacco Lines

Genomic DNA was prepared from a group of 2041 
transgenic events. The lines included Magenta box stage 
2041-13, and greenhouse stage plants 2041-13-1, 2041-13-2,
2041-13-5, 2041-9, 2041-20A and 2041-20B. A
transgenic GUS line (2023) was included as a negative
control. Southern analysis of these lines was performed.
The genomic tobacco DNA was restricted with the enzyme
SstI, which should result in an 8.9 kb hybridization
product when hybridized to a tcdA gene-specific probe.
The 8.9 kb hybridization product should consist of the 
35T promoter and the tcdA coding region. All 2041 plants 
contained a band of the expected size. Events 2041-9 and 
-20 appear to be the same line with 5 identical 
hybridizing bands. Event 2041-13 produced 6 
hybridization fragments with the tcdA coding region 
probe. Magenta box and various greenhouse plants of 
2041-13 all produced the same hybridization profile. 
This hybridization pattern was different from that of 
events 2041-9 and -20. 

RNA analysis, using the tcdA coding region probe,
was performed on the same group of greenhouse 2041
plants. Immunoblot analysis had revealed that plants
2041-9, 2041-20A, 2041-20B, and 2041-13-1 produced no
detectable TcdA protein, while 2041-13-2 and 2041-13-5
produced substantial amounts of full-length TcdA.
Northern analysis was in agreement with the immunoblot
result. A faint RNA signal was detected for plants 2041-9,
2041-20A, 2041-20B, and 2041-13-1. Only faintly
visible was a band corresponding to full-length tcdA
transcript in plant 2041-13-1. In contrast, for plants
2041-13-2 and 2041-13-5 a strong RNA signal was detected,
with a substantial amount of full-length size (~8.0 kb)
tcdA transcript. These data support the observed
bioassay activity for this group of plants.

Genomic DNA was prepared from a second functionally
active 2041 transgenic event, 2041-29. Southern analysis
of this line was performed. A transgenic GUS line (2023)
was included as a negative control; DNA of line 2041-9
was included as a positive control.

The genomic tobacco DNAs were restricted with the
enzyme SstI, which should result in an 8.9 kb hybridization
product when hybridized to a tcdA gene-specific probe.
The 8.9 kb hybridization product should consist of the
35T promoter and the tcdA coding region. For plant 2041-29-5,
three hybridization products larger than 8.9 kb
were detected with the tcdA gene-specific probe.
Immunoblot analysis has demonstrated that pre-pro TcdA protein
is made by this plant; it is therefore likely that a
restriction site was lost during transformation or
regeneration, or that the 2041-29 genomic DNA was not
thoroughly digested.

D. Tobacco Leaf-Disk Tests With Tobacco Hornworm
Exhibiting Insect Control

Leaves were sampled from tobacco plants, Nicotiana
tabacum, previously transplanted into the greenhouse. A
single leaf was sampled from each plant on each test
date. Leaves were selected from the zone where younger
elongate leaves transition into older ovate leaves.
Excised leaves were placed into 12 oz. cups with the
petiole submerged in water to maintain turgor, and
transported to the laboratory.

Eight 1.4 cm disks were cut from the center portion
of one side of each leaf (right adaxial side up, with
distal portion facing away from the observer). Each disk
was placed individually into a well of a C-D
International 128 well tray (Pitman, NJ) into which 0.5
ml of a 1.6% aqueous agar solution had been previously
pipetted. The solidified agar prevented the leaf disks
from drying out. The adaxial surface of the disk was
always oriented up.

A single neonate tobacco hornworm, Manduca sexta,
was placed on each disk and the wells were sealed with
vented plastic lids. The assay was held at 27°C and 40%
RH. Larval mortality and live-weight data were collected
after 3 days. Data were subjected to analysis of
variance and Duncan's multiple range test (α = 0.05) (Proc
GLM, SAS Institute Inc., Cary, NC). Data were
transformed using a logarithmic function to correct a
correlation between the magnitude of the mean and
variance.
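
The log-transform-then-ANOVA step can be sketched in plain Python. This is not the SAS Proc GLM analysis used in the source: it is a minimal one-way ANOVA on hypothetical larval weights, it assumes a base-10 logarithm (the text does not state the base), and it does not implement Duncan's multiple range test.

```python
import math


def log_transform(groups):
    """Apply log10 to every observation (the base is an assumption)."""
    return [[math.log10(value) for value in group] for group in groups]


def one_way_anova(groups):
    """Return (F statistic, between-group df, within-group df)."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical surviving-larva weights (mg) for three plants; not data from the source.
weights_mg = [
    [18.1, 16.4, 19.0, 17.2],   # control-like plant
    [15.9, 17.5, 16.8, 18.3],   # control-like plant
    [6.1, 7.4, 5.8, 6.9],       # expressor-like plant
]

f_stat, df_b, df_w = one_way_anova(log_transform(weights_mg))
print(f"F({df_b}, {df_w}) = {f_stat:.1f} on log-transformed weights")
```

Taking logarithms before the analysis compresses the spread of the heavier groups, which is the usual remedy when the variance grows with the mean.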

Table 6
Results of leaf-disk assays from greenhouse grown tobacco

Weight of Surviving Larvae (mg) & Duncan's Group¹

TRT  Plant                    Plant Age  Pretest  Test 1       Test 2    Test 3    3 Test Sum.
13   non-transformed - 2      young                                      18.8 a*
14   non-transformed - 3      young                                      17.0 ab
16   non-transformed - 5      young                                      16.4 ab
3    2041-13-1 (western -)    young               17.6 a       18.2 a    16.1 ab   17.3 a
9    Gus Control              old        19.3 a   14.6 a       16.3 a    14.5 ab   15.1 a
10   non-transformed - 1      young               8.3 b        16.8 a    13.9 b    13.0 b
11   2041-20B (western -)     old                 10.0 b*      13.7 ab   14.6 ab   12.9 b
15   non-transformed - 4      young                                      13.0 bc
8    2041-20A (western -)     old        15.7 a   8.3 b        11.3 bc   9.2 cd    9.6 c
12   2041-9 (western -)       old        19.5 a                          7.9 d
7    2041-13-5 (western +)    young               6.3 bc       9.6 cd    7.2 de    7.7 d
5    2041-13-3 (western +)    young               6.4 bc****   6.2 e     6.8 de**  6.4 de
1    2041-13A (western +)     old        7.2 b    6.8 bc*      7.0 de*   5.4 e     6.4 de
6    2041-13-4 (western +)    young               4.9 c****    5.8 e     7.6 d     6.4 de
4    2041-13-2 (western +)    young               5.7 bc       5.7 e**   7.5 d     6.3 de
2    2041-13B (western +)     old                 4.7 c**      5.6 e     7.2 de    5.9 e

* Number of stars corresponds to the number of dead
larvae per 8 tested.
¹ Data transformed (logarithm) for analysis.
Means followed by the same letter are not significantly
different (alpha = 0.05).






TABLE 7
Results Of Leaf-Disk Assays From Greenhouse Grown Tobacco
Plants With Event 2041-29.

Plant           Test 1    Test 2    Test 3       Test 4      Four Test Summary
2014-6 GUS 1    15.8 a    16.6 a    **5.5 bc     *12.9 ab    13.2 a
2014-6 GUS 2    14.4 a    *6.6 bc   *13.4 a      15.2 a      12.6 a
KY-160 NTC      13.4 a    6.7 bc    7.9 b        8.5 bc      9.1 b
2041-29 4P      *4.9 b    *7.3 b    ****6.9 b    ********    6.3 c
2041-29 7       *5.9 b    5.1 bc    ***6.7 b     ***7.2 c    6.1 c
2041-29 3P      *5.6 b    **7.9 b   *****6.5 b   ***3.6 d    5.9 c
2041-29 2P      6.3 b     ****4.7 c ******4.1 c  ******4.6 d 5.4 c

* Number of stars corresponds to the number of dead
larvae per 8 tested.
1. Data transformed (logarithm) for analysis.
Means followed by the same letter are not significantly
different (alpha = 0.05).

All event 2041-29 plants significantly depressed THW 
larval weight gain compared to control plants. Average 
weight depression was 49%. Statistically significant 
mortality occurred in THW larvae exposed to foliage from 
2041-29 plants. Mortality averaged 37.5% compared to 
5.2% in controls. 
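
The two summary figures quoted above can be re-derived from Table 7. The sketch below is an illustrative check rather than the original analysis: it reads each star (or equivalent marker symbol) in front of a table entry as one dead larva out of eight, and it estimates weight depression from the unweighted means of the "Four Test Summary" column, which is an assumption about how the 49% figure was obtained.

```python
LARVAE_PER_TEST = 8

# Dead larvae per test (Tests 1-4), read from the markers in Table 7.
DEAD = {
    "2014-6 GUS 1": [0, 0, 2, 1],
    "2014-6 GUS 2": [0, 1, 1, 0],
    "KY-160 NTC":   [0, 0, 0, 0],
    "2041-29 4P":   [1, 1, 4, 8],
    "2041-29 7":    [1, 0, 3, 3],
    "2041-29 3P":   [1, 2, 5, 3],
    "2041-29 2P":   [0, 4, 6, 6],
}

# "Four Test Summary" surviving-larva weights (mg) from Table 7.
SUMMARY_MG = {
    "2014-6 GUS 1": 13.2, "2014-6 GUS 2": 12.6, "KY-160 NTC": 9.1,
    "2041-29 4P": 6.3, "2041-29 7": 6.1, "2041-29 3P": 5.9, "2041-29 2P": 5.4,
}

treated = [name for name in DEAD if name.startswith("2041-29")]
controls = [name for name in DEAD if not name.startswith("2041-29")]


def percent_mortality(names):
    """Pooled percentage of dead larvae across all tests for the given plants."""
    dead = sum(sum(DEAD[name]) for name in names)
    tested = sum(len(DEAD[name]) for name in names) * LARVAE_PER_TEST
    return 100.0 * dead / tested


def mean_weight(names):
    return sum(SUMMARY_MG[name] for name in names) / len(names)


treated_mortality = percent_mortality(treated)
control_mortality = percent_mortality(controls)
weight_depression = 100.0 * (1.0 - mean_weight(treated) / mean_weight(controls))

print(f"mortality: {treated_mortality:.1f}% (2041-29) vs {control_mortality:.1f}% (controls)")
print(f"weight depression: {weight_depression:.0f}%")
```

Read this way, the table gives 48 dead of 128 larvae on 2041-29 foliage and 5 dead of 96 on control foliage, consistent with the stated averages.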

E. Isolation and Characterization of Functional 
Photorhabdus Toxin Protein From Transgenic Plants 

Seven grams of transgenic tobacco plants (2041-13)
expressing the tcdA (Toxin A) gene were homogenized with 10
ml 50 mM potassium phosphate buffer, pH 7.0, using a bead
beater (Biospec Products, Bartlesville, OK) according to
the manufacturer's instructions. The homogenate was filtered
through four layers of cheesecloth and then centrifuged
at 35,000 g for 15 min. The supernatant was collected and
filtered through a 0.22 µm Millipore Express™ membrane. It
was then applied to a Superdex 200 column (2.6 x 40 cm)
which had been equilibrated with 20 mM Tris buffer, pH
8.0 (Buffer A). The protein was eluted in Buffer A at a
flow rate of 3 ml/min. Fractions of 3 ml each were
collected and subjected to southern corn rootworm (SCR)
bioassay. It was found that fractions corresponding to a
native molecular weight around 860 kDa had the highest
insecticidal activity. Western analysis of the active
fraction using a polyclonal antibody specific to Toxin A
indicated the presence of full-length TcdA peptide. The
active fractions were further combined and applied to a
Mono Q 10/10 column which had been equilibrated with
Buffer A. Proteins bound to the column were then eluted
by a linear gradient of 0 to 1 M NaCl in Buffer A.
Fractions of 2 ml each were collected and analyzed by
both SCR bioassay and Western analysis using antibody specific to
Toxin A. The results again demonstrated the correlation
between insecticidal activity and presence of full-length
TcdA peptide.

F. Characterization of Progeny Transgenic Plants

The inheritability of the genetically engineered
plants containing the Photorhabdus toxin gene was
evaluated by generating F1 progeny. Progeny were
generated from the 2041-13 event by selfing
expression-positive plants. The 2041-13 plants in the greenhouse
were allowed to self-pollinate. Seed capsules were
collected when mature and were allowed to dry and
after-ripen on the laboratory bench for two weeks. Seed from the
plant designated 2041-13A was surface-sterilized and
distributed on the surface of medium TOB- without
selection, to allow recovery of nonexpressing or
nontransgenic progeny as well as expressing and
segregating transgenic siblings. Seed was germinated in
a lighted incubator room (16 h light, 28°C). After 1
month, fifty-one seedlings, designated 2041-13A-S1
through S51, were distributed into Magenta boxes
containing medium TOB- to grow further. Three weeks
later, leaf samples from these Magenta-box grown
seedlings were submitted for evaluation of the level of
expression of TcdA toxin.

Leaf samples were tested for kanamycin response by
placing sterile leaf segments on medium TOB+ containing
100 mg/L kanamycin in the light and scoring for tissue
growth and color after two weeks. All leaf pieces showed
some positive response, indicating complex segregation.

This group of in vitro grown event 2041-13 progeny
seedlings were all transplanted into the greenhouse
approximately two months after seeding onto medium, using
the following method. After washing the agar from the
roots, plants were transplanted into 5½ inch square pots
in a soil mix containing 75% MetroMix and 25% mineral
soil. They were enclosed in a zip-lock bag and plain
water added to leave 1-2 inches of water in the bottom of
the bag after soil absorption. These bags were closed and
placed under a cart in the greenhouse to protect them
from direct sunlight. The bags were opened after 5-6
days, and removed after 7 days, when the plants were
adapted to soil and were moved to the top of the cart for
normal greenhouse culture. Plants were ready to test in
insect bioassays at four weeks post transplant.

F1 progeny were evaluated for expression of protein
toxin by immunological screen and for biological activity
by plant bioassays, as described previously, using
tobacco hornworm. There was a positive correlation
between the level of expression of protein toxin and the degree of
growth inhibition, and at higher expression levels
mortality was observed. The difference in biological
activity between the non-transformed population and the
transformed population expressing protein toxin was
statistically significant at high confidence levels.

The following table summarizes the results of insect
(tobacco hornworm) bioassays conducted with F1 progeny of
self-fertilized 2041-13 plants genetically engineered to
produce the "204" A toxin. The tests included 6
non-expressing progeny (protein-negative controls), 45 toxin
A expressors, and 4 non-transformed controls (KY-160).
Results are from three leaf-disk assays (method
previously outlined) where eight disks were used per
test. The data were analyzed using analysis of variance
and were blocked by test.

The treatment effect for each of these analyses
indicated the Pr > F was less than 0.0001. The Toxin A
expressors produced significant control of tobacco
hornworm compared to each of the control groups based on
each of the three measures of efficacy. The two control
groups behaved similarly. Statistical analysis using
ANOVA and an LSD test with alpha equal to 0.01 (or 1%)
showed differences between the 3 groups. The LSD test
indicated that the non-expressors and the non-transformed
plants were similar in larval weights but the expressors
gave weights significantly lower than either of the other
two groups of plants. These data demonstrated that the
genetic basis for insect control was inheritable and
corresponded to the presence of expressed toxin gene.

Table 8
Tobacco hornworm results from F1 progeny of self-fertilized
2041-13 tobacco plants.

Mean Value and Duncan's Groupingᵈ

Treatment Group            Total Weight (mg)ᵃ   Survivor Weight (mg)ᵇ   Leaf Area (cm²)ᶜ
Non-transformed Control    15.8 a               15.8 a                  1.2 a
Protein-negative Control   16.4 a               16.5 a                  1.2 a
Toxin A Expressor          8.1 b                9.2 b                   4.9 b

ᵃ Average insect weight with dead insects considered to
weigh nothing.
ᵇ Average insect weight with dead insects excluded from
analysis.
ᶜ Total leaf area remaining per eight leaf disks. Initial
area was approximately 12 cm².
ᵈ Means followed by the same letter are not significantly
different (alpha = 0.05).
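
The leaf-area column of Table 8 can be put in context with a short calculation. The sketch below derives the initial area of eight 1.4 cm disks (the disk size given in the leaf-disk method) and converts the remaining areas into approximate percentages consumed. Treating every disk as a perfect circle is an assumption, and the variable names are illustrative.

```python
import math

DISK_DIAMETER_CM = 1.4   # leaf-disk diameter from the bioassay method
DISKS_PER_TEST = 8

# Total leaf area remaining per eight disks (cm^2), from Table 8.
REMAINING_CM2 = {
    "Non-transformed Control": 1.2,
    "Protein-negative Control": 1.2,
    "Toxin A Expressor": 4.9,
}

# Area of eight circular disks; agrees with the "approximately 12 cm^2" footnote.
initial_area_cm2 = DISKS_PER_TEST * math.pi * (DISK_DIAMETER_CM / 2) ** 2

consumed_pct = {
    group: 100.0 * (initial_area_cm2 - remaining) / initial_area_cm2
    for group, remaining in REMAINING_CM2.items()
}

print(f"initial area: {initial_area_cm2:.1f} cm^2")
for group, pct in consumed_pct.items():
    print(f"{group}: about {pct:.0f}% of leaf area consumed")
```

On this estimate larvae consumed roughly nine tenths of the control tissue but only about six tenths of the tissue from Toxin A expressors.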

Example 4

Transformation Of Maize With a Vector Carrying Plasmid
pDAB1834 Encoding Photorhabdus Toxins

A. Preparation Of Maize Transformation Vectors
Containing Modified Plant-Optimized tcdA Coding Regions:
Plasmid pDAB1834



Preparation of maize transformation vectors was
accomplished in two steps. First, a modified plant-optimized
tcdA coding region was ligated into a plant
expression cassette plasmid. In this step, the coding
region was placed under the transcriptional control of a
promoter functional in maize plant cells. RNA
transcription termination and polyadenylation were
mediated by a downstream copy of the terminator region
from the Agrobacterium nopaline synthase gene. One
plasmid designed to function in this role is pDAB1538. In
the second step, the complete gene, comprising the
promoter, coding region, and 3' UTR terminator region, was
ligated to a plant transformation vector that contained a
plant-expressible selectable marker gene which allowed
the selection of transformed maize plant cells amongst a
background of nontransformed cells. An example of such a
vector is pDAB367.

It is a feature of plasmid pDAB1538 that any coding
region having an NcoI site at its 5' end and a SacI site
3' to the coding region, when cloned into the unique NcoI
and SacI sites of pDAB1538, is placed under the
transcriptional control of the maize ubiquitin1 (ubi1)
promoter. It is also a feature of pDAB1538 that the 5'
untranslated leader (UTR) sequence preceding the NcoI
site comprises a polylinker. Additionally, it is a
feature of pDAB1538 that transcription termination and
polyadenylation of the mRNA containing the introduced
coding region are mediated by termination/Poly A addition
sequences derived from the nopaline synthase (Nos) gene.
Finally, it is a feature of pDAB1538 that the entire
assembly of promoter/coding region/3' UTR can be obtained
as a single DNA fragment by cleavage at the flanking NotI
sites.

It is a feature of pDAB367 that the phosphinothricin
acetyl transferase protein, which has as its substrate
phosphinothricin and related compounds, is produced in
plant cells through transcription of its coding region
mediated by the Cauliflower Mosaic Virus 35S promoter and
that termination of transcription plus polyadenylation
are mediated by the nopaline synthase terminator region.
It is further a feature of pDAB367 that any DNA fragment
containing flanking NotI sites can be cloned into the
unique NotI site of pDAB367, thus physically linking the
introduced DNA fragment to the aforementioned selectable
marker gene.

To prepare a maize plant-expressible gene to produce
the endoplasmic reticulum-targeted TcdA protein in plant
cells, DNA of a plasmid (pA0H_4-ER) containing the
plant-optimized, ER-targeted tcdA coding region (SEQ ID NO: 6)
was cleaved with restriction enzymes NcoI and SacI, and
the large 7610 bp fragment was ligated to similarly-cut
DNA of plasmid pDAB1538 to produce plasmid pDAB1832. DNA
of pDAB1832 was then digested with NotI, and the 9984 bp
NotI fragment was ligated into the unique NotI site of
pDAB367 to produce plasmid pDAB1834.

It is a feature of plasmid pDAB1834 that the ubi1
and 35S promoters are encoded on the same DNA strand.
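
The two-step cloning scheme described above depends on the coding region carrying an NcoI site at its 5' end and a SacI site 3' to it, and on the assembled cassette being flanked by NotI sites. As an illustration only, the following Python sketch scans a DNA string for the recognition sequences of these enzymes (NcoI CCATGG, SacI GAGCTC, NotI GCGGCCGC). The toy sequence, the function name, and the compatibility check are hypothetical and are not taken from the plasmids described here.

```python
# Recognition sequences of the enzymes named in the text (all palindromic).
SITES = {"NcoI": "CCATGG", "SacI": "GAGCTC", "NotI": "GCGGCCGC"}


def find_sites(seq, enzyme):
    """Return the 0-based start position of every recognition site for `enzyme`."""
    site = SITES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Hypothetical toy insert: an NcoI site overlapping the ATG start codon,
# a short open reading frame, and a SacI site after the stop codon.
toy_insert = "CCATGGCTAAAGAAATCTAAGAGCTC"

ncoi = find_sites(toy_insert, "NcoI")
saci = find_sites(toy_insert, "SacI")

# A coding region suited to this kind of cassette has exactly one of each
# site, with the NcoI site upstream of the SacI site.
compatible = len(ncoi) == 1 and len(saci) == 1 and ncoi[0] < saci[0]
print(ncoi, saci, compatible)
```

The same function can be pointed at a full coding region to confirm that no internal NcoI, SacI, or NotI site would fragment the insert during either cloning step.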


B. Transformation and Regeneration of Transgenic Maize 
Isolates 

Type II callus cultures were initiated from immature
zygotic embryos of the genotype "Hi-II" (Armstrong et
al., (1991) Maize Genet. Coop. Newslett., 65: 92-93).
Embryos were isolated from greenhouse-grown ears from
crosses between Hi-II parent A and Hi-II parent B or F2
embryos derived from a self- or sib-pollination of a
Hi-II plant. Immature embryos (1.5 to 3.5 mm) were cultured
on initiation medium consisting of N6 salts and vitamins
(Chu et al., (1978) The N6 medium and its application to
anther culture of cereal crops. Proc. Symp. Plant Tissue
Culture, Peking Press, 43-56), 1.0 mg/L 2,4-D, 25 mM
L-proline, 100 mg/L casein hydrolysate, 10 mg/L AgNO3, 2.5
g/L GELRITE (Schweizerhall, South Plainfield, NJ), and 20
g/L sucrose, with a pH of 5.8. After four to six weeks,
callus was subcultured onto maintenance medium
(initiation medium in which AgNO3 was omitted and
L-proline was reduced to 6 mM). Selection for Type II
callus took place for ca. 12-16 weeks.

Plasmid pDAB1834 was transformed into embryogenic
callus. For blasting, 140 µg of plasmid DNA was
precipitated onto 60 mg of alcohol-rinsed, spherical gold
particles (1.5 - 3.0 µm diameter, Aldrich Chemical Co.,
Inc., Milwaukee, WI) by adding 74 µL of 2.5 M CaCl2 H2O and
30 µL of 0.1 M spermidine (free base) to 300 µL of plasmid
DNA and H2O. The solution was immediately vortexed and
the DNA-coated gold particles were allowed to settle.
The resulting clear supernatant was removed and the gold
particles were resuspended in 1 ml of absolute ethanol.
This suspension was diluted with absolute ethanol to
obtain 15 mg DNA-coated gold/mL.

Approximately 600 mg of embryogenic callus tissue
was spread over the surface of Type II callus maintenance
medium as described herein lacking casein hydrolysate and
L-proline, but supplemented with 0.2 M sorbitol and 0.2 M
mannitol as an osmoticum. Following a 4 h pre-treatment,
tissue was transferred to culture dishes containing
blasting medium (osmotic media solidified with 20 g/L TC
agar (PhytoTechnology Laboratories, LLC, Shawnee Mission,
KS) instead of 7 g/L GELRITE). Helium blasting
accelerated suspended DNA-coated gold particles towards
and into the prepared tissue targets. The device used
was an earlier prototype of that described in US Patent
5,141,131, which is incorporated herein by reference.
Tissues were covered with a stainless steel screen (104
µm openings) and placed under a partial vacuum of 25
inches of Hg in the device chamber. The DNA-coated gold
particles were further diluted 1:1 with absolute ethanol
prior to blasting and were accelerated at the callus
targets four times using a helium pressure of 1500 psi,
with each blast delivering 20 µL of the DNA/gold
suspension. Immediately post-blasting, the tissue was
transferred to osmotic media for a 16-24 h recovery
period. Afterwards, the tissue was divided into small
pieces and transferred to selection medium (maintenance
medium lacking casein hydrolysate and L-proline but
containing 30 mg/L BASTA® (AgrEvo, Berlin, Germany)).
Every four weeks for 3 months, tissue pieces were
non-selectively transferred to fresh selection medium. After
7 weeks and up to 22 weeks, callus sectors found
proliferating against a background of growth-inhibited
tissue were removed and isolated. The resulting
BASTA®-resistant tissue was subcultured biweekly onto fresh
selection medium. Following western analysis, positive
transgenic lines were identified and transferred to
regeneration media. Western-negative lines underwent
subsequent RNA spot blot analysis to identify negative
controls for regeneration.
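
The quantities given for the blasting procedure imply how much gold and DNA reach a target. The Python sketch below works through that arithmetic from the stated figures (140 µg DNA on 60 mg gold, a 15 mg/mL suspension diluted 1:1, four blasts per target). It rests on three assumptions that the text does not state: that the per-blast volume reads as 20 µL, that "diluted 1:1" means equal volumes (halving the concentration), and that all of the DNA remains bound to the gold. The variable names are illustrative.

```python
# Figures taken from the text.
dna_ug = 140.0                 # plasmid DNA precipitated onto the gold
gold_mg = 60.0                 # gold particles used
stock_mg_per_ml = 15.0         # DNA-coated gold after dilution in ethanol
blast_volume_ml = 0.020        # assumed: 20 microliters delivered per blast
blasts_per_target = 4

# Assumption: a 1:1 dilution with ethanol halves the gold concentration.
dilution_factor = 2.0

# Volume of the 15 mg/mL suspension obtained from 60 mg of gold.
stock_volume_ml = gold_mg / stock_mg_per_ml

# Gold concentration actually loaded into the device.
working_mg_per_ml = stock_mg_per_ml / dilution_factor

# Gold and DNA delivered per blast and per target (four blasts).
gold_per_blast_mg = working_mg_per_ml * blast_volume_ml
dna_per_mg_gold_ug = dna_ug / gold_mg   # assumes complete DNA binding
dna_per_blast_ug = gold_per_blast_mg * dna_per_mg_gold_ug
gold_per_target_mg = gold_per_blast_mg * blasts_per_target
dna_per_target_ug = dna_per_blast_ug * blasts_per_target

print(f"stock volume: {stock_volume_ml:.1f} mL")
print(f"per blast: {gold_per_blast_mg:.2f} mg gold, {dna_per_blast_ug:.2f} ug DNA")
print(f"per target: {gold_per_target_mg:.2f} mg gold, {dna_per_target_ug:.2f} ug DNA")
```

Under these assumptions each blast carries about 0.15 mg of gold and 0.35 µg of DNA, so one target receives roughly 0.6 mg of gold and 1.4 µg of DNA over the four blasts.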

Regeneration was initiated by transferring callus
tissue to cytokinin-based induction medium, which
consisted of Murashige and Skoog salts, hereinafter MS
salts, and vitamins (Murashige and Skoog, (1962) Physiol.
Plant. 15: 473-497), 30 g/L sucrose, 100 mg/L
myo-inositol, 30 g/L mannitol, 5 mg/L 6-benzylaminopurine,
hereinafter BAP, 0.025 mg/L 2,4-D, 30 mg/L BASTA®, and
2.5 g/L GELRITE at pH 5.7. The cultures were placed in
low light (125 ft-candles) for one week followed by one
week in high light (325 ft-candles). Following a two
week induction period, tissue was non-selectively
transferred to hormone-free regeneration medium, which
was identical to the induction medium except that it
lacked 2,4-D and BAP, and was kept in high light. Small
(1.5-3 cm) plantlets were removed and placed in 150x25 mm
culture tubes containing SH medium (SH salts and vitamins
(Schenk and Hildebrandt, (1972) Can. J. Bot. 50:199-204),
10 g/L sucrose, 100 mg/L myo-inositol, 5 mL/L FeEDTA, and
2.5 g/L GELRITE, pH 5.8). Plantlets were transferred to
12 cm pots containing approximately 0.25 kg of METRO-MIX
360 (The Scotts Co., Marysville, OH) in the greenhouse as
soon as they exhibited growth and developed a sufficient
root system. They were grown with a 16 h photoperiod
supplemented by a combination of high pressure sodium and
metal halide lamps, and were watered as needed with a
combination of three independent Peters Excel fertilizer
formulations (Grace-Sierra Horticultural Products
Company, Milpitas, CA). At the 6-8 leaf stage, plants
were transplanted to five gallon pots containing
approximately 4 kg METRO-MIX 360, and grown to maturity.

EXAMPLE 5

Characterization Of Transgenic Maize Plants
Expressing Photorhabdus Toxin That Confer Insect Control.

A. Insect Bioassays

A single leaf was sampled from each plant in each
test. Eight 1.4 cm disks were cut from the outer portion
of each leaf (approximately 30 cm long), avoiding the
center vein. Each disk was placed individually into a
well of a C-D International 128 well tray (Pitman, NJ)
into which 0.5 ml of a 1.6% aqueous agar solution had
been previously pipetted. The solidified agar prevented
the leaf disks from drying out. The adaxial surface of
the disk was always oriented up.

Five neonate southern corn rootworms, Diabrotica
undecimpunctata howardi, were placed on each disk and the
wells were sealed with vented plastic lids. The assay
was held at 27°C and 40% RH. Larval mortality and
live-weight data were collected after 3 days. Data were
subjected to analysis of variance and Duncan's multiple
range test (α = 0.05) (Proc GLM, SAS Institute Inc., Cary,
NC). Weight data were transformed using a logarithmic
function to correct a correlation between the magnitude
of the mean and variance.

TABLE 9

Results of Maize Leaf-disk Test vs SCR

Treatment        Mean % Kill    Mean Survival Weight (mg)
                 (Duncan's)     (Duncan's)
1834-11          68 A           0.064 A
1834-17          44 B           0.098 B
1834-15          26 BC          0.127 C
Hi-II control    13 C           0.161 C

Note: Means followed by the same letter are not
significantly different based on Duncan's multiple range
test (alpha=0.05). Insect groups weighing less than 0.1
mg were set to 0.03 mg instead of zero to conduct a more
conservative analysis. Mortality (arcsin(sqrt)) and
weight (log10) data were transformed for analyses.
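
The note to Table 9 names two data transformations: an arcsine square-root transformation for mortality and a log10 transformation for weight, with insect groups weighing less than 0.1 mg set to 0.03 mg. The Python sketch below shows one plausible way to express them. It assumes mortality is supplied as a percentage and that the 0.03 mg substitution is applied to group weights before the logarithm is taken; the function names are illustrative, and the original analysis was run in SAS, not with this code.

```python
import math


def arcsine_sqrt(percent_kill):
    """Arcsine square-root transform of a mortality percentage (result in radians)."""
    proportion = percent_kill / 100.0
    return math.asin(math.sqrt(proportion))


def log10_weight(group_weight_mg, threshold_mg=0.1, substitute_mg=0.03):
    """log10 of a group weight, with weights below the threshold set to 0.03 mg."""
    if group_weight_mg < threshold_mg:
        group_weight_mg = substitute_mg
    return math.log10(group_weight_mg)


# Applied to the Table 9 mean % kill values purely for illustration;
# the analysis itself would transform the per-replicate data.
for treatment, kill in [("1834-11", 68), ("1834-17", 44), ("1834-15", 26), ("Hi-II control", 13)]:
    print(treatment, round(arcsine_sqrt(kill), 3))
```

The substitution of 0.03 mg avoids taking the logarithm of zero when the insects in a group are too small to register a weight.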

The results shown in Table 9 demonstrated that two events
expressing TcdA protein were statistically distinct from
control lines bioassayed using SCR neonates by mortality and
survival weight criteria. These results demonstrated that
southern corn rootworm were functionally affected by feeding
on maize plants containing and expressing the tcdA gene.
Those plants from 1834-11 were used to generate progeny for
testing of inheritability of the transgene.

B. PRODUCTION AND PROGENY TEST OF tcdA TRANSGENIC MAIZE

Origin and growth of progeny plants: Sibling plants
1834-11-07 and 1834-11-08, clonally derived by regeneration
from the callus of transgenic maize event 1834-11, were
transplanted to the greenhouse and pollinated with inbred
OQ414. Seeds obtained from these crosses, comprising seed
lots 1834-11-07A and 1834-11-08A, were planted in
Rootrainers (1 M inch x 2 inch x 8 inch deep, product
#647, C. Hummert Intl., Earth City, Mo.) filled with
Metro-Mix 360 soilless mix (Scotts Terra-Lite, available
from Hummert Intl.) and top irrigated with Hoagland's
nutrient solution. (Hoagland's solution contains 229 ppm
nitrogen as nitrate, 24.6 ppm nitrogen as ammonium, 26
ppm P, 157 ppm K, 187 ppm Ca, 49 ppm Mg, and 30 ppm Na.)

Greenhouse conditions for this trial were: 16 hour
days, daylight supplemented by metal halide lamps as
needed to achieve a minimum of 600 µEinsteins/cm² PAR, and
ambient temperature 30°C days, 22°C nights.

Leaves were sampled for protein determination
approximately one week after planting. Leaf bioassays
were conducted 2-3 weeks after planting; root bioassays
were initiated approximately 3 weeks post planting.


Protein analysis of progeny plants: Protein was extracted
from leaf and root samples harvested from transgenic
plants, line 1834-11 progenies, and non-transformed
plants. Each sample was placed on a 1.6 x 4 cm piece of
3M Whatman™ paper. The paper was folded lengthwise and
inserted in a flexible straw. A volume of 350 µl of an
extraction buffer (9.5 ml of 0.2 M NaH2PO4, 15.5 ml of 0.2
M Na2HPO4, 2 ml of 0.5 M Na2EDTA, 100 µl of Triton X-100,
1 ml of 10% Sarkosyl, 78 µl of beta-mercaptoethanol, H2O
to bring total volume to 100 ml, 50 µg/ml Antipain, 50
µg/ml Leupeptin, 0.1 mM Chymostatin, 5 µg/ml Pepstatin)
was pipetted on to the paper. The straw containing the
sample was then passed through a rolling device used for
squeezing the extract into a 1.5 ml microcentrifuge tube.
The extract was centrifuged for 10 minutes at 14,000 rpm
in an Eppendorf refrigerated micro-centrifuge. The
supernatant was transferred into a new tube. The amount
of the total extractable protein was determined using a
standard BioRad Protein Analysis protocol (BioRad
Laboratories, Hercules, CA).

The presence of the TcdA protein was visualized by
Western blot analysis following a standard procedure for
protein separation (Laemmli, 1970). A volume of twenty
µl of extract was loaded in each well of a 4-20% gradient
polyacrylamide gel (Owl Scientific Co., MA) for
electrophoresis. Subsequently, the protein was
transferred onto a nitrocellulose membrane using a
semi-dry electroblotter (Pharmacia LKB Biotechnology,
Piscataway, NJ). The membrane was incubated for one hour
in TBST-M solution (10% milk in TBST solution; 25 mM Tris
HCl pH 7.4, 1.36 mM NaCl, 2.7 mM KCl, 0.1% Tween 20).
Thereafter, the primary antibody (Anti-TcdA in TBST-M)
was added. After one hour, the membrane was washed with
TBST for five minutes, three times. Then the secondary
antibody solution (goat anti-rabbit IgG conjugated to
horseradish peroxidase; Bio-Rad Laboratories, in TBST-M)
was added to the membrane. After one hour of incubation,
the membrane was washed with an excess amount of TBST for
10 minutes, four times. The protein was visualized using
the Super Signal® West Pico chemiluminescence method
(Pierce Chemical Co., Rockford, IL). The protein blot
was exposed on a Hyper-film (Amersham, Arlington Heights,
IL) and was developed within 3 minutes. The intensity of
the protein band was measured using a densitometer
(Molecular Dynamics Inc., Sunnyvale, CA) and compared to
standards.

Three of six plants from seed lot 1834-11-07A and
three of six plants from seed lot 1834-11-08A produced
detectable levels of TcdA protein (Table 1).
Approximately 3.8 to 13.3 ppm of TcdA were detected in
the leaf blades and 4.1 to 8.4 ppm were detected in the
leaf tips of the protein-positive plants. The amounts of
TcdA protein detected in the roots were slightly lower
than those found in the leaves.



Insect bioassays with progeny plants: Plants were
selected for bioassay based on results from Western blot
analysis. Twelve (12) 6.4 mm diameter leaf discs were
cut from the youngest leaf of each 2 week old seedling.
Each disc was placed in a well of a 128-well tray (CD
International) containing approximately 0.5 mL of a
solidified 2% agar in water solution. Two neonate
southern corn rootworm, Diabrotica undecimpunctata
howardi (Barber) (SCR), were placed in each well with a
leaf disc. Trays were covered with perforated lids and
maintained under a controlled environment for 3 days
(28°C; 16 hours light: 8 hours dark; approx. 60% relative
humidity). Living larvae from 4 leaf discs were pooled
and weighed, producing 3 weight determinations per plant.
Average weights were calculated by dividing the pooled
weight by the number of survivors. Differences in
average weights of SCR fed leaf discs from protein
positive and protein negative plants were assessed using
analysis of variance on the natural log-transformed
average weights (Minitab, v. 12.2, Minitab Inc., State
College, PA).

Root bioassays were initiated approximately 1 week
after the initiation of the leaf disc bioassays.
Approximately 24 h prior to eclosion, SCR eggs were
suspended in a 0.15% solution of agar in water to a
concentration of 100 eggs/ml. Plants were inoculated
with SCR eggs by pipetting 2.0 ml of the egg suspension
(i.e., approximately 200 eggs) just below the soil surface
at the base of each plant. Two weeks after inoculation,
plants were removed from their Rootrainer pots, their
roots were washed free of potting mix, and the roots were
scored for rootworm damage based on a 1 (resistant) to 9
(susceptible) rating system (Welch, 1977). The results of
the root ratings were examined using non-parametric tests
to determine if the distribution of root ratings from the
protein positive plants was the same as the distribution of
the ratings from the protein negative plants. Testing was
done at the 5% significance level (StatXact v. 3, CYTEL
Software Corporation, Cambridge, MA).

Results from leaf and root bioassays of tcdA protein
positive and protein negative progeny plants are
summarized in Table 10. The average weights of SCR
larvae fed leaf discs from protein positive plants were
significantly lower than those of larvae fed leaf discs
from protein negative plants (F = 4.6; d.f. = 1, 34; P <
0.001). The Kolmogorov-Smirnov 2 sample test (p=0.04) and
the Wald-Wolfowitz runs test (p=0.001) indicated that
the protein positive and protein negative root rating
distributions were not similar. The Wilcoxon-Mann-Whitney
test (p=0.0206) and the Normal Scores test
(p=0.206) indicated that the average root rating for the
protein positive plants was lower than the average root
rating from the protein negative plants.
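
To make the rank-based comparison concrete, the following Python sketch computes a one-sided exact permutation p-value for a rank-sum (Wilcoxon-Mann-Whitney type) statistic with midranks for ties. It is a generic illustration, not the StatXact procedure used for the reported values, and the function names are hypothetical. The example data are the root ratings legible in Table 10; one protein-positive rating is illegible in the source and is left out, so the result is not expected to reproduce the reported p-values.

```python
from itertools import combinations


def midranks(values):
    """Assign 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """One-sided exact p-value that group_a tends to have lower values than group_b."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    observed = sum(ranks[:n_a])
    total = 0
    as_extreme = 0
    # Enumerate every way of assigning n_a of the pooled ranks to group_a.
    for subset in combinations(range(len(ranks)), n_a):
        total += 1
        if sum(ranks[i] for i in subset) <= observed + 1e-9:
            as_extreme += 1
    return as_extreme / total


# Root ratings (1 = resistant, 9 = susceptible) legible in Table 10.
protein_negative = [8, 9, 9, 9, 9, 9]
protein_positive_legible = [9, 2, 4, 8, 4]   # one illegible rating omitted

p_value = exact_rank_sum_p(protein_positive_legible, protein_negative)
print(f"one-sided exact p = {p_value:.4f}")
```

Full enumeration is practical here because the groups are small; for larger samples a normal approximation or a dedicated statistics package would be the usual choice.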

Table 10. Protein analysis and insect bioassay results
with progeny of TcdA transgenic maize.

Plant Number      TcdA Protein   Leaf Disc Bioassay   Root Bioassay
                                 Avg. Wt. (mg)        Root Rating (1-9)
1834-11-07A-30    PRO-           0.190                8
1834-11-08A-21    PRO-           0.196                9
1834-11-08A-16    PRO-           0.195                9
1834-11-08A-14    PRO-           0.137                9
1834-11-07A-22    PRO-           0.208                9
1834-11-07A-20    PRO-           0.175                9
1834-11-07A-26    PRO+           0.118                9
1834-11-08A-17    PRO+
1834-11-07A-14    PRO+           0.110                2
1834-11-07A-11    PRO+           0.106                4
1834-11-08A-28    PRO+           0.129                8
1834-11-08A-27    PRO+           0.108                4


DNA analysis of progeny plants: Leaf samples from
1834-11-07A and 1834-11-08A progeny plants were placed in
conical 50 ml polypropylene tubes and dried in a Labconco
Freeze Dry Lyophilizer (Kansas City, MO) for 1-2 days. Lyophilized
leaves were then ground in a Tecator Cyclotec 1093 Sample
mill grinder (Hoganas, Sweden) and stored at -20°C.
Genomic DNA was extracted by the following procedure: (1)
to a 25 ml conical tube containing 300-500 mg of ground
tissue, 9 ml of CTAB (cetyl trimethylammonium bromide)
solution was added, and the mixture was incubated at 65°C
for 1 hour; (2) 4.5 ml of chloroform:octanol (24:1) was
added and mixed gently for 5 minutes; (3) samples were
centrifuged at 2000 rpm and DNA was precipitated from the
supernatant with an equal volume of isopropanol; (4) DNA was
collected on a glass hook, washed in ethanol, and
dissolved in TE (10 mM Tris-HCl, 0.5 mM EDTA, pH 8.0).

Genomic DNA was digested at 37°C for 2 hours in an
Eppendorf tube containing the following mixture:
8 µl of 800 µg/ml DNA, 2 µl of 1 mg/ml BSA (bovine serum
albumin), 2 µl of 10x buffer, 1 µl of SacI, 1 µl of EcoRI,
and 6 µl of H2O. Digested DNA samples were electrophoresed overnight
at 40 mA in a 0.85% SeaKem LE agarose gel (FMC, Rockland,
Maine). The gel was blotted onto Millipore Immobilon-Ny+
(Bedford, MA) membrane overnight in 20X SSC (NaCl 175.2
g/l, Na citrate 88 g/l). The probe DNA was cut with
BamHI/SacI (NEB, Beverly, MA) from pDAB1551 plasmid,
which released a 7356 bp fragment containing the open
reading frame of the rebuilt tcdA gene. This 7356 bp
fragment was labeled with 32P using a Stratagene Prime-It
RmT dCTP-Labeling Reactions kit (La Jolla, CA) and used
for Southern hybridization. Hybridization was conducted
in hybridization buffer (10% polyethylene glycol, 7% SDS
[sodium dodecyl sulfate], 0.6X SSC, 10 mM NaPO4, 5 mM
EDTA, 10 µg/ml denatured salmon sperm DNA) at 60°C overnight.
After hybridization, the membrane was washed with 10X SSC
plus 0.1% SDS at 60°C for 30 min and exposed to X-ray
film (Hyperfilm® MP, Amersham Life Sciences, Piscataway,
NJ) for 1-2 days.
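
The composition of the digestion mixture described above can be checked with a few lines of arithmetic. The Python sketch below assumes the component volumes are in microliters, the reading under which the 10x buffer ends up at 1x in the final reaction. The variable names are illustrative.

```python
# Component volumes in microliters (assumed unit), as listed for the digest.
volumes_ul = {
    "genomic DNA": 8.0,
    "BSA": 2.0,
    "10x buffer": 2.0,
    "SacI": 1.0,
    "EcoRI": 1.0,
    "H2O": 6.0,
}
dna_stock_ug_per_ml = 800.0
bsa_stock_mg_per_ml = 1.0

# Total reaction volume.
total_ul = sum(volumes_ul.values())

# Mass of genomic DNA in the reaction (microliters -> milliliters -> micrograms).
dna_ug = volumes_ul["genomic DNA"] / 1000.0 * dna_stock_ug_per_ml

# Final concentrations after dilution into the full reaction volume.
buffer_final_x = 10.0 * volumes_ul["10x buffer"] / total_ul
bsa_final_mg_per_ml = bsa_stock_mg_per_ml * volumes_ul["BSA"] / total_ul

print(f"total volume: {total_ul:.0f} ul")
print(f"DNA per digest: {dna_ug:.1f} ug")
print(f"buffer: {buffer_final_x:.1f}x, BSA: {bsa_final_mg_per_ml:.2f} mg/ml")
```

On this reading each digest contains about 6.4 µg of genomic DNA in a 20 µl reaction, with the buffer at 1x and BSA at 0.1 mg/ml.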

Results summarized indicate that a pattern of 8
hybridizing bands (the size of the expected fragment and
larger) cosegregated with protein expression in 50% of
all progeny assayed. These results are characteristic of
a complex insertion at a single site. All seedlings
containing the insert also expressed toxin protein.
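
A transgene present at a single locus in a hemizygous parent is expected to appear in about half of the progeny of a cross to a non-transgenic inbred. As a hedged illustration of how such a 1:1 expectation can be checked, the Python sketch below runs a chi-square goodness-of-fit test on the six protein-positive and six protein-negative progeny reported above. The function name is hypothetical, no such test is described in the source, and with only twelve plants the chi-square approximation is rough.

```python
import math


def chi_square_1_to_1(n_positive, n_negative):
    """Chi-square goodness-of-fit statistic and p-value (1 d.f.) for a 1:1 ratio."""
    total = n_positive + n_negative
    expected = total / 2.0
    chi2 = ((n_positive - expected) ** 2 + (n_negative - expected) ** 2) / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Three of six plants in each of the two seed lots were protein positive.
chi2, p_value = chi_square_1_to_1(6, 6)
print(chi2, p_value)
```

A large p-value means the observed counts give no reason to reject single-locus 1:1 segregation; it does not by itself prove a single insertion site, which is why the hybridization pattern is also considered.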

EXAMPLE 6

Transformation Of Rice With a Vector Carrying Plasmid
pDAB1553 Encoding Photorhabdus Toxins

A. Plasmid pDAB1553

Plasmid pDAB1553, containing tcdA driven by the maize
ubiquitin1 promoter and hpt (hygromycin
phosphotransferase, providing resistance to the antibiotic
hygromycin) under the control of 35T (a modified 35S
promoter), was used for transformation.

Preparation of rice transformation vectors was
accomplished in two steps. First, a modified
plant-optimized tcdA coding region was ligated into a rice
plant expression cassette plasmid. In this step, the
coding region was placed under the transcriptional
control of a promoter functional in plant cells. RNA
transcription termination and polyadenylation were
mediated by a downstream copy of the terminator region
from the Agrobacterium nopaline synthase gene. One
plasmid designed to function in this role is plasmid
pDAB1538 (described in the section on maize
transformation vectors). In the second step, the
complete gene comprised of the promoter, coding region,
and terminator region was ligated to a rice plant
transformation vector that contained a plant expressible
selectable marker gene which allowed the selection of
transformed rice plant cells amongst a background of
nontransformed cells. An example of such a vector is
pDAB354-Notl.

It is a feature of pDAB354-Notl that the hygromycin
phosphotransferase protein, which has as its substrate
hygromycin B and related compounds, is produced in plant
cells through transcription of its coding region mediated
by the Cauliflower Mosaic Virus 35S promoter and that
termination of transcription plus polyadenylation are
mediated by the nopaline synthase terminator region. It
is further a feature of pDAB354-Notl that any DNA
fragment containing flanking NotI sites can be cloned
into the unique NotI site of pDAB354-Notl, thus
physically linking the introduced DNA fragment to the
aforementioned selectable marker gene.

To prepare a plant-expressible gene to produce the
non-targeted TcdA protein in rice plant cells, DNA of a
plasmid (pA0H_4-OPTI) containing the plant-optimized tcdA
coding region (SEQ ID NO: 3) was cleaved with restriction
enzymes NcoI and SacI, and the large 7550 bp fragment was
ligated to similarly-cut DNA of plasmid pDAB1538 to
produce plasmid pDAB1551. DNA of pDAB1551 was then
digested with NotI, and the large 9933 bp fragment was
ligated to NotI digested DNA of pDAB354-Notl to produce
plasmid pDAB1553.

It is a feature of plasmid pDAB1553 that the ubi1
and 35S promoters are encoded on the same DNA strand.

B. Production of Rice transgenics

For initiation of embryogenic callus, mature seeds
of a Japonica cultivar, Taipei 309, were dehusked and
surface-sterilized in 70% ethanol for 2-5 min, followed
by a 30-45 min soak in 50% commercial bleach (2.6% sodium
hypochlorite) with a few drops of 'Liquinox' soap. The
seeds were then rinsed 3 times in sterile distilled water
and placed on filter paper before transferring to 'callus
induction' medium (i.e., NB). The NB medium consisted of
N6 macro elements (Chu, 1978, The N6 medium and its
application to anther culture of cereal crops. Proc.
Symp. Plant Tissue Culture, Peking Press, p43-56), B5
micro elements and vitamins (Gamborg et al., 1968,
Nutrient requirements of suspension cultures of soybean
root cells. Exp. Cell Res. 50: 151-158), 300 mg/L casein
hydrolysate, 500 mg/L L-proline, 500 mg/L L-glutamine, 30
g/L sucrose, 2 mg/L 2,4-dichlorophenoxyacetic acid
(2,4-D), and 2.5 g/L gelrite (Schweizerhall, NJ) with the pH
adjusted to 5.8. The mature seeds cultured on 'induction'
media were incubated in the dark at 28°C. After 3 weeks
of culture, the emerging primary callus induced from the
scutellar region of the mature embryo was transferred to
fresh NB medium for further maintenance.

About 140 µg of plasmid pDAB1553 DNA was
precipitated onto 60 mg of 1.0 micron (Bio-Rad) gold
particles as described herein.

For helium blasting, actively growing embryogenic
callus cultures, 2-4 mm in size, were subjected to a high
osmoticum treatment. This treatment included placing of
callus on NB medium with 0.2 M mannitol and 0.2 M
sorbitol (Vain et al., 1993, Osmoticum treatment enhances
particle bombardment-mediated transient and stable
transformation of maize. Plant Cell Rep. 12: 84-88) for
4 h before helium blasting. Following osmoticum
treatment, callus cultures were transferred to 'blasting'
medium (NB+2% agar) and covered with a stainless steel
screen (230 micron). The callus cultures were blasted at
2,000 psi helium pressure twice per target. After
blasting, callus was transferred back to the media with
high osmoticum overnight before placing on selection
medium, which consisted of NB medium with 30 mg/L
hygromycin. After 2 weeks, the cultures were transferred
to fresh selection medium with a higher concentration of
selection agent, i.e., NB+50 mg/L hygromycin (Li et al.,
1993, An improved rice transformation system using the
biolistic method. Plant Cell Rep. 12: 250-255).
Compact, white-yellow, embryogenic callus cultures,
recovered on NB+50 mg/L hygromycin, were regenerated by
transferring to 'pre-regeneration' (PR) medium + 50 mg/L
hygromycin. The PR medium consisted of NB medium with 2
mg/L benzyl aminopurine (BAP), 1 mg/L naphthalene acetic
acid (NAA), and 5 mg/L abscisic acid (ABA). After 2
weeks of culture in the dark, they were transferred to
'regeneration' (RN) medium. The composition of RN
medium is NB medium with 3 mg/L BAP and 0.5 mg/L NAA.
The cultures on RN medium were incubated for 2 weeks at
28°C under high fluorescent light (325 ft-candles). The
plantlets with 2 cm shoots were transferred to 1/2 MS
medium (Murashige and Skoog, 1962, A revised medium for
rapid growth and bioassays with tobacco tissue cultures.
Physiol. Plant. 15:473-497) with 1/2 B5 vitamins, 10 g/L
sucrose, 0.05 mg/L NAA, 50 mg/L hygromycin and 2.5 g/L
gelrite adjusted to pH 5.8 in magenta boxes. When
plantlets were established with well-developed root
systems, they were transferred to soil (1 metromix: 1 top
soil) and raised in the greenhouse (29/24°C day/night
cycle, 50-60% humidity, 12 h photoperiod) until maturity.



EXAMPLE 7

Characterization Of Transgenic Rice Plants Expressing
Photorhabdus Toxin That Confer Insect Control.

A. Insect bioassays

Insect bioassays were performed using leaf discs, and the
transgenic plants were shown to be highly effective in
controlling southern corn rootworm. Diabrotica
undecimpunctata howardi eggs were
obtained from French Ag Research and hatched in petri
dishes held at 28.5°C and 40% RH. The aerial parts were
sampled from the transgenic plants and placed singly
into inverted petri dishes (100x15 mm) containing 15 ml of
1.6% aqueous agar in the bottom to provide humidity and
filter paper in the top to absorb condensation. These
preparations were infested with five neonate larvae per
dish and held at 28.5°C and 40% RH for 3 days. Mortality
and larval weights were recorded. Weight data were
transformed using a logarithmic function to correct a
correlation between the magnitude of the mean and
variance.



Table 11

Treatment      Average Survivor      Presence of TcdA in greenhouse-grown
               Weight in mg          plants (number of +/number of
               (Duncan's Grouping)   plants tested)
GUS Control    0.390 A
1553-33        0.170 BCD             ++
1553-44        0.167 BCD             +++
1553-62        0.125 CD              +++
1553-41        0.100 D               +++

Note: Means followed by the same letter are
not significantly different based on Duncan's
multiple range test (alpha=0.05).
Insect groups weighing less than 0.1 mg were set to 0.03 mg
instead of zero to conduct a more conservative analysis.
Weight data were transformed (log10) for analyses. A single
replicate was used on each of three test dates. Plants were
sampled from magenta boxes.

The results demonstrate that in leaf disc bioassays, several
rice events derived by transformation with the tcdA gene
had a statistically significant functional effect on
corn rootworm neonates.

Claims

1. An isolated nucleic acid of SEQ ID NO: 3 or SEQ ID
NO: 4.

2. A transgenic monocot cell having a genome comprising
SEQ ID NO: 3 or SEQ ID NO: 4.

3. A transgenic dicot cell having a genome comprising
SEQ ID NO: 3 or SEQ ID NO: 4.

4. A transgenic plant with a genome comprising a
nucleic acid of SEQ ID NO: 3 or SEQ ID NO: 4 that imparts
insect resistance.

5. A transgenic plant of claim 4 wherein the plant is
rice.

6. A transgenic plant of claim 4 wherein the plant is
maize.

7. A transgenic plant of claim 4 wherein the plant is
tobacco.

SEQUENCE LISTING

<110> Petell, Jim
      Merlo, Donald
      Herman, Rod
      Roberts, Jean
      Guo, Lining
      Schafer, Barry
      Sukhapinda, Kitisri
      Owens Merlo, Ann

<120> Transgenic Plants Expressing Photorhabdus Toxin

<130> 50698

<140>
<141>

<150> US 60/148,356
<151> 1999-08-11

<160> 8

<170> PatentIn Ver. 2.0

<210> 1
<211> 7551
<212> DNA
<213> Photorhabdus luminescens

<220>
<221> CDS
<222> (1)..(7548)

<400> 1 

atg aac gag tct gta aaa gag ata cct gat gta tta aaa agc cag tgt 48
Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15

ggt ttt aat tgt ctg aca gat att agc cac agc tct ttt aat gaa ttt 96
Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30

cgc cag caa gta tct gag cac ctc tcc tgg tcc gaa aca cac gac tta 144
Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45

tat cat gat gca caa cag gca caa aag gat aat cgc ctg tat gaa gcg 192
Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
50 55 60

cgt att ctc aaa cgc gcc aat ccc caa tta caa aat gcg gtg cat ctt 240
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80

gcc att ctc gct ccc aat gct gaa ctg ata ggc tat aac aat caa ttt 288
Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95

agc ggt aga gcc agt caa tat gtt gcg ccg ggt acc gtt tct tcc atg 336
Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110

ttc tec ccc gec get tat ttg act gaa ctt tat cgt gaa gca cgc aat 384 
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn 
115 120 125 

tta cac gca agt gac tec gtt tat tat ctg gat acc cgc cgc cca gat 4 32 
Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp 
130 135 140 

etc aaa tea atg gcg etc agt cag caa aat atg gat ata gaa tta tec 4 80 
Leu Lys Ser Met Ala Leu Ser Gin Gin Asn Met Asp He Glu Leu Ser 
145 150 155 160 

aca etc tct ttg tec aat gag ctg tta ttg gaa age att aaa act gaa 528 
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser He Lys Thr Glu 
165 170 175 

tct aaa ctg gaa aac tat act aaa gtg atg gaa atg etc tec act ttc 576 
Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe 
180 185 190 

cgt cct tec ggc gca acg cct tat cat gat get tat gaa aat gtg cgt 624 
Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg 
195 200 205 

gaa gtt ate cag eta caa gat cct gga ctt gag caa etc aat gca tea 672 
Glu Val He Gin Leu Gin Asp Pro Gly Leu Glu Gin Leu Asn Ala Ser 
210 215 220 

ccg gca att gec ggg ttg atg cat caa gee tec eta ttg ggt att aac 720 
Pro Ala He Ala Gly Leu Met His Gin Ala Ser Leu Leu Gly He Asn 
225 230 235 240 

get tea ate teg cct gag eta ttt aat att ctg acg gag gag att acc 768 
Ala Ser He Ser Pro Glu Leu Phe Asn He Leu Thr Glu Glu He Thr 
245 250 255 

gaa ggt aat get gag gaa ctt tat aag aaa aat ttt ggt aat ate gaa 816 
Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn He Glu 
260 265 270 

ccg gee tea ttg get atg ccg gaa tac ctt aaa cgt tat tat aat tta 864 
Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu 
215 280 285 

age gat gaa gaa ctt agt cag ttt att ggt aaa gee age aat ttt ggt 912 
Ser Asp Glu Glu Leu Ser Gin Phe He Gly Lys Ala Ser Asn Phe Gly 
290 295 300 



caa cag gaa tat agt aat aac caa ctt att act ccg gta gtc aac agc 960
Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser
305 310 315 320

agt gat ggc acg gtt aag gta tat cgg atc acc cgc gaa tat aca acc 1008
Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335

aat gct tat caa atg gat gtg gag cta ttt ccc ttc ggt ggt gag aat 1056
Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
340 345 350

tat cgg tta gat tat aaa ttc aaa aat ttt tat aat gcc tct tat tta 1104
Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365

tcc atc aag tta aat gat aaa aga gaa ctt gtt cga act gaa ggc gct 1152
Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala
370 375 380

cct caa gtc aat ata gaa tac tcc gca aat atc aca tta aat acc gct 1200
Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala
385 390 395 400

gat atc agt caa cct ttt gaa att ggc ctg aca cga gta ctt cct tcc 1248
Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser
405 410 415

ggt tct tgg gca tat gcc gcc gca aaa ttt acc gtt gaa gag tat aac 1296
Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430

caa tac tct ttt ctg cta aaa ctt aac aag gct att cgt cta tca cgt 1344
Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445

gcg aca gaa ttg tca ccc acg att ctg gaa ggc att gtg cgc agt gtt 1392
Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val
450 455 460

aat cta caa ctg gat atc aac aca gac gta tta ggt aaa gtt ttt ctg 1440
Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu
465 470 475 480



act aaa tat tat atg cag cgt tat gct att cat gct gaa act gcc ctg 1488
Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu
485 490 495

ata cta tgc aac gcg cct att tca caa cgt tca tat gat aat caa cct 1536
Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510

agc caa ttt gat cgc ctg ttt aat acg cca tta ctg aac gga caa tat 1584
Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr
515 520 525

ttt tct acc ggc gat gag gag att gat tta aat tca ggt agc acc ggc 1632
Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly
530 535 540

gat tgg cga aaa acc ata ctt aag cgt gca ttt aat att gat gat gtc 1680
Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val
545 550 555 560

tcg ctc ttc cgc ctg ctt aaa att acc gac cat gat aat aaa gat gga 1728
Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys Asp Gly
565 570 575

aaa att aaa aat aac cta aag aat ctt tcc aat tta tat att gga aaa 1776
Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys
580 585 590

tta ctg gca gat att cat caa tta acc att gat gaa ctg gat tta tta 1824
Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu
595 600 605



ctg att gcc gta ggt gaa gga aaa act aat tta tec get ate agt gat 1872 
Leu He Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala He Ser Asp 
610 615 620 

aag caa ttg get acc ctg ate aga aaa etc aat act att acc age tgg 1920 
Lys Gin Leu Ala Thr Leu He Arg Lys Leu Asn Thr He Thr Ser Trp 
625 630 635 640 

eta cat aca cag aag tgg agt gta ttc cag eta ttt ate atg acc tec 1968 
Leu His Thr Gin Lys Trp Ser Val Phe Gin Leu Phe He Met Thr Ser 
645 650 655 

acc age tat aac aaa acg eta acg cct gaa att aag aat ttg ctg gat 2016 
Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu He Lys Asn Leu Leu Asp 
660 665 670 

acc gtc tac cac ggt tta caa ggt ttt gat aaa gac aaa gca gat ttg 2064 
Thr Val Tyr His Gly Leu Gin Gly Phe Asp Lys Asp Lys Ala Asp Leu 
675 680 685 

eta cat gtc atg gcg ccc tat att gcg gcc acc ttg caa tta tea teg 2112 
Leu His Val Met Ala Pro Tyr He Ala Ala Thr Leu Gin Leu Ser Ser 
690 695 700 

gaa aat gtc gcc cac teg gta etc ctt tgg gca gat aag tta cag ccc 2160 
Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gin Pro 
705 710 715 720 

ggc gac ggc gca atg aca gca gaa aaa ttc tgg gac tgg ttg aat act 2208 
Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr 
725 730 735

aag tat acg ccg ggt tea teg gaa gcc gta gaa acg cag gaa cat ate 2256 
Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gin Glu His He 
740 745 750 

gtt cag tat tgt cag get ctg gca caa ttg gaa atg gtt tac cat tec 2304 
Val Gin Tyr Cys Gin Ala Leu Ala Gin Leu Glu Met Val Tyr His Ser 
755 760 765 

acc ggc ate aac gaa aac gcc ttc cgt eta ttt gtg aca aaa cca gag 2352 
Thr Gly He Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu 
770 775 780 

atg ttt ggc gct gca act gga gca gcg ccc gcg cat gat gcc ctt tca 2400
Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser 
785 790 795 800 

ctg att atg ctg aca cgt ttt gcg gat tgg gtg aac gca cta ggc gaa 2448
Leu He Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 
805 810 815 

aaa gcg tcc tcg gtg cta gcg gca ttt gaa gct aac tcg tta acg gca 2496
Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 
820 825 830 

gaa caa ctg gct gat gcc atg aat ctt gat gct aat ttg ctg ttg caa 2544
Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln
835 840 845

gcc agt att caa gca caa aat cat caa cat ctt ccc cca gta act cca 2592
Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro
850 855 860

gaa aat gcg ttc tcc tgt tgg aca tct atc aat act atc ctg caa tgg 2640
Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp
865 870 875 880

gtt aat gtc gca caa caa ttg aat gtc gcc cca cag ggc gtt tcc gct 2688
Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala
885 890 895

ttg gtc ggg ctg gat tat att caa tca atg aaa gag aca ccg acc tat 2736
Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr
900 905 910

gcc cag tgg gaa aac gcg gca ggc gta tta acc gcc ggg ttg aat tca 2784
Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser
915 920 925

caa cag gct aat aca tta cac gct ttt ctg gat gaa tct cgc agt gcc 2832
Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940

gca tta agc acc tac tat atc cgt caa gtc gcc aag gca gcg gcg gct 2880
Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala
945 950 955 960

att aaa agc cgt gat gac ttg tat caa tac tta ctg att gat aat cag 2928
Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln
965 970 975

gtt tct gcg gca ata aaa acc acc cgg atc gcc gaa gcc att gcc agt 2976
Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
980 985 990

att caa ctg tac gtc aac cgg gca ttg gaa aat gtg gaa gaa aat gcc 3024
Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala
995 1000 1005

aat tcg ggg gtt atc agc cgc caa ttc ttt atc gac tgg gac aaa tac 3072
Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp Lys Tyr
1010 1015 1020



aat aaa cgc tac agc act tgg gcg ggt gtt tct caa tta gtt tac tac 3120
Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr
1025 1030 1035 1040

ccg gaa aac tat att gat ccg acc atg cgt atc gga caa acc aaa atg 3168
Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met
1045 1050 1055

atg gac gca tta ctg caa tcc gtc agc caa agc caa tta aac gcc gat 3216
Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp
1060 1065 1070

acc gtc gaa gat gcc ttt atg tct tat ctg aca tcg ttt gaa caa gtg 3264
Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val
1075 1080 1085

get aat ctt aaa gtt att age gca tat cac gat aat att aat aac gat 3312 
Ala Asn Leu Lys Val lie Ser Ala Tyr His Asp Asn He Asn Asn Asp 
1090 1095 1100 

caa ggg ctg acc tat ttt ate gga etc agt gaa act gat gee ggt gaa 3360 
Gin Gly Leu Thr Tyr Phe lie Gly Leu Ser Glu Thr Asp Ala Gly Glu 
1105 1110 1115 1120

tat tat tgg cgc agt gtc gat cac agt aaa ttc aac gac ggt aaa ttc 3408 
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 
1125 1130 1135

gcg get aat gec tgg agt gaa tgg cat aaa att gat tgt cca att aac 3456 
Ala Ala Asn Ala Trp Ser Glu Trp His Lys He Asp Cys Pro He Asn 
1140 1145 1150

cct tat aaa age act ate cgt cca gtg ata tat aaa tec cgc ctg tat 3504 
Pro Tyr Lys Ser Thr He Arg Pro Val He Tyr Lys Ser Arg Leu Tyr 
1155 1160 1165

ctg etc tgg ttg gaa caa aag gag ate acc aaa cag aca gga aat agt 3552 
Leu Leu Trp Leu Glu Gin Lys Glu He Thr Lys Gin Thr Gly Asn Ser 
1170 1175 1180

aaa gat ggc tat caa act gaa acg gat tat cgt tat gaa eta aaa ttg 3600 
Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 
1185 1190 1195 1200

gcg cat ate cgc tat gat ggc act tgg aat acg cca ate acc ttt gat 3648 
Ala His He Arg Tyr Asp Gly Thr Trp Asn Thr Pro He Thr Phe Asp 
1205 1210 1215 

gtc aat aaa aaa ata tec gag eta aaa ctg gaa aaa aat aga gcg ccc 3696 
Val Asn Lys Lys He Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 
1220 1225 1230 

gga ctc tat tgt gcc ggt tat caa ggt gaa gat acg ttg ctg gtg atg 3744
Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met
1235 1240 1245

ttt tat aac caa caa gac aca eta gat agt tat aaa aac get tea atg 3792 
Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 
1250 1255 1260 

caa gga eta tat ate ttt get gat atg gca tec aaa gat atg acc cca 3840 
Gin Gly Leu Tyr He Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 
1265 1270 1275 1280

gaa cag age aat gtt tat egg gat aat age tat caa caa ttt gat acc 3888 
Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 
1285 1290 1295 

aat aat gtc aga aga gtg aat aac cgc tat gca gag gat tat gag att 3936 
Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu He 
1300 1305 1310 

cct tec teg gta agt age cgt aaa gac tat ggt tgg gga gat tat tac 3984 
Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 
1315 1320 1325
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etc age atg gta tat aac gga gat att cca act ate aat tac aaa gec 4032 
Leu Ser Met Val Tyr Asn Gly Asp He Pro Thr He Asn Tyr Lys Ala 
1330 1335 1340 

gca tea agt gat tta aaa ate tat ate tea cca aaa tta aga att att 4080 
Ala Ser Ser Asp Leu Lys He Tyr He Ser Pro Lys Leu Arg lie He 
1345 1350 1355 1360 

cat aat gga tat gaa gga cag aag cgc aat caa tgc aat ctg atg aat 4128 
His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu Met Asn 
1365 1370 1375 

aaa tat ggc aaa eta ggt gat aaa ttt att gtt tat act age ttg ggg 4176 
Lys Tyr Gly Lys Leu Gly Asp Lys Phe He Val Tyr Thr Ser Leu Gly 
1380 1385 1390 

gtc aat cca aat aac teg tea aat aag etc atg ttt tac ccc gtc tat 4224 
Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 
1395 1400 1405 

caa tat age gga aac ace agt gga etc aat caa ggg aga eta eta ttc 4272 
Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 
1410 1415 1420 

cac cgt gac acc act tat cca tct aaa gta gaa gct tgg att cct gga 4320
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp He Pro Gly 
1425 1430 1435 1440 

gca aaa cgt tct cta acc aac caa aat gcc gcc att ggt gat gat tat 4368
Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala He Gly Asp Asp Tyr 
1445 1450 1455 

get aca gac tct ctg aat aaa ccg gat gat ctt aag caa tat ate ttt 4416 
Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr He Phe 
1460 1465 1470 

atg act gac agt aaa ggg act gct act gat gtc tca ggc cca gta gag 4464
Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu
1475 1480 1485

att aat act gca att tct cca gca aaa gtt cag ata ata gtc aaa gcg 4512 
He Asn Thr Ala He Ser Pro Ala Lys Val Gin lie lie Val Lys Ala 
1490 1495 1500 

ggt ggc aag gag caa act ttt acc gca gat aaa gat gtc tec att cag 4560 
Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser lie Gin 
1505 1510 1515 1520 

cca tca cct agc ttt gat gaa atg aat tat caa ttt aat gcc ctt gaa 4608
Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gin Phe Asn Ala Leu Glu 
1525 1530 1535 

ata gac ggt tct ggt ctg aat ttt att aac aac tca gcc agt att gat 4656
lie Asp Gly Ser Gly Leu Asn Phe lie Asn Asn Ser Ala Ser He Asp 
1540 1545 1550 

gtt act ttt acc gca ttt gcg gag gat ggc cgc aaa ctg ggt tat gaa 4704 
Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 
1555 1560 1565 
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agt ttc agt att cct gtt acc etc aag gta agt acc gat aat gec ctg 4752 
Ser Phe Ser lie Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 
1570 1575 1580 

acc ctg cac cat aat gaa aat ggt gcg caa tat atg caa tgg caa tec 4800 
Thr Leu His His Asn Glu Asn Gly Ala Gin Tyr Met Gin Trp Gin Ser 
1585 1590 1595 1600 

tat cgt acc cgc ctg aat act eta ttt gec cgc cag ttg gtt gca cgc 4848 
Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg
1605 1610 1615 

gec acc acc gga ate gat aca att ctg agt atg gaa act cag aat att 4896 
Ala Thr Thr Gly He Asp Thr He Leu Ser Met Glu Thr Gin Asn He 
1620 1625 1630



cag gaa ccg cag tta ggc aaa ggt ttc tat gct acg ttc gtg ata cct 4944
Gin Glu Pro Gin Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val He Pro 
1635 1640 1645 

ccc tat aac cta tca act cat ggt gat gaa cgt tgg ttt aag ctt tat 4992
Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 
1650 1655 1660 

ate aaa cat gtt gtt gat aat aat tea cat att ate tat tea ggc cag 5040 
He Lys His Val Val Asp Asn Asn Ser His He He Tyr Ser Gly Gin 
1665 1670 1675 1680 

eta aca gat aca aat ata aac ate aca tta ttt att cct ctt gat gat 5088 
Leu Thr Asp Thr Asn He Asn He Thr Leu Phe He Pro Leu Asp Asp 
1685 1690 1695 

gtc cca ttg aat caa gat tat cac gee aag gtt tat atg acc ttc aag 5136 
Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 
1700 1705 1710 

aaa tea cca tea gat ggt acc tgg tgg ggc cct cac ttt gtt aga gat 5184 
Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 
1715 1720 1725 

gat aaa gga ata gta aca ata aac cct aaa tec att ttg acc cat ttt 5232 
Asp Lys Gly He Val Thr He Asn Pro Lys Ser lie Leu Thr His Phe 
1730 1735 1740 

gag age gtc aat gtc ctg aat aat att agt age gaa cca atg gat ttc 5280 
Glu Ser Val Asn Val Leu Asn Asn He Ser Ser Glu Pro Met Asp Phe 
1745 1750 1755 1760

age ggc get aac age etc tat ttc tgg gaa ctg ttc tac tat acc ccg 5328 
Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 
1765 1770 1775 

atg ctg gtt get caa cgt ttg ctg cat gaa cag aac ttc gat gaa gee 5376 
Met Leu Val Ala Gin Arg Leu Leu His Glu Gin Asn Phe Asp Glu Ala 
1780 1785 1790 

aac cgt tgg ctg aaa tat gtc tgg agt cca tec ggt tat att gtc cac 5424 
Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr He Val His 
1795 1800 1805 

ggc cag att cag aac tac cag tgg aac gtc cgc ccg tta ctg gaa gac 5472
Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp
1810 1815 1820

acc agt tgg aac agt gat cct ttg gat tcc gtc gat cct gac gcg gta 5520
Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val
1825 1830 1835 1840

gca cag cac gat cca atg cac tac aaa gtt tca act ttt atg cgt acc 5568
Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr
1845 1850 1855

ttg gat cta ttg ata gca cgc ggc gac cat gct tat cgc caa ctg gaa 5616
Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu
1860 1865 1870

cga gat aca ctc aac gaa gcg aag atg tgg tat atg caa gcg ctg cat 5664
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His
1875 1880 1885

cta tta ggt gac aaa cct tat cta ccg ctg agt acg aca tgg agt gat 5712
Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp
1890 1895 1900

cca cga cta gac aga gcc gcg gat atc act acc caa aat gct cac gac 5760
Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp
1905 1910 1915 1920

agc gca ata gtc gct ctg cgg cag aat ata cct aca ccg gca cct tta 5808
Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu
1925 1930 1935

tca ttg cgc agc gct aat acc ctg act gat ctc ttc ctg ccg caa atc 5856
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile
1940 1945 1950

aat gaa gtg atg atg aat tac tgg cag aca tta gct cag aga gta tac 5904
Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr
1955 1960 1965

aat ctg cgt cat aac ctc tct atc gac ggc cag ccg tta tat ctg cca 5952
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro
1970 1975 1980

atc tat gcc aca ccg gcc gat ccg aaa gcg tta ctc agc gcc gcc gtt 6000
Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val
1985 1990 1995 2000

gcc act tct caa ggt gga ggc aag cta ccg gaa tca ttt atg tcc ctg 6048
Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu
2005 2010 2015

tgg cgt ttc ccg cac atg ctg gaa aat gcg cgc ggc atg gtt agc cag 6096
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln
2020 2025 2030

ctc acc cag ttc ggc tcc acg tta caa aat att atc gaa cgt cag gac 6144
Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp
2035 2040 2045

gcg gaa gcg ctc aat gcg tta tta caa aat cag gcc gcc gag ctg ata 6192
Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile
2050 2055 2060

ttg act aac ctg agc att cag gac aaa acc att gaa gaa ttg gat gcc 6240
Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala
2065 2070 2075 2080

gag aaa acg gtg ttg gaa aaa tcc aaa gcg gga gca caa tcg cgc ttt 6288
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe
2085 2090 2095



gat age tac ggc aaa ctg tac gat gag aat ate aac gee ggt gaa aac 6336 
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly Glu Asn 
2100 2105 2110 

caa gec atg acg eta cga gcg tec gec gee ggg ctt acc acg gca gtt 6384 
Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 
2115 2120 2125 



cag gca tcc cgt ctg gcc ggt gcg gcg gct gat ctg gtg cct aac atc 6432
Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He 
2130 2135 2140 

ttc ggc ttt gcc ggt ggc ggc agc cgt tgg ggg gct atc gct gag gcg 6480
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala
2145 2150 2155 2160

aca ggt tat gtg atg gaa ttc tec gcg aat gtt atg aac acc gaa gcg 6528 
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 
2165 2170 2175 

gat aaa att agc caa tct gaa acc tac cgt cgt cgc cgt cag gag tgg 6576
Asp Lys He Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Trp 
2180 2185 2190 

gag ate cag egg aat aat gec gaa gcg gaa ttg aag caa ate gat get 6624 
Glu He Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin lie Asp Ala 
2195 2200 2205 

cag etc aaa tea etc get gta cgc cgc gaa gee gee gta ttg cag aaa 6672 
Gin Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys 
2210 2215 2220 

acc agt ctg aaa acc caa caa gaa cag acc caa tct caa ttg gee ttc 6720 
Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe 
2225 2230 2235 2240 

ctg caa cgt aag ttc agc aat cag gcg tta tac aac tgg ctg cgt ggt 6768
Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly
2245 2250 2255

cga ctg gcg gcg att tac ttc cag ttc tac gat ttg gee gtc gcg cgt 6816 
Arg Leu Ala Ala He Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ala Arg 
2260 2265 2270 

tgc ctg atg gca gaa caa get tac cgt tgg gaa etc aat gat gac tct 6864 
Cys Leu Met Ala Glu Gin Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 
2275 2280 2285 

gee cgc ttc att aaa ccg ggc gee tgg cag gga acc tat gee ggt ctg 6912 
Ala Arg Phe He Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu 
2290 2295 2300 
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ctt gca ggt gaa acc ttg atg ctg agt ctg gca caa atg gaa gac get 6960 
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Met Glu Asp Ala 
2305 2310 2315 2320 

cat ctg aaa cgc gat aaa cgc gca tta gag gtt gaa cgc aca gta teg 7008 
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 
2325 2330 2335 



ctg gcc gaa gtt tat gca gga tta cca aaa gat aac ggt cca ttt tcc 7056
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser
2340 2345 2350

ctg gct cag gaa att gac aag ctg gtg agt caa ggt tca ggc agt gcc 7104
Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala
2355 2360 2365

ggc agt ggt aat aat aat ttg gcg ttc ggc gcc ggc acg gac act aaa 7152
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys
2370 2375 2380

acc tct ttg cag gca tca gtt tca ttc gct gat ttg aaa att cgt gaa 7200
Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu
2385 2390 2395 2400

gat tac ccg gca tcg ctt ggc aaa att cga cgt atc aaa cag atc agc 7248
Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser
2405 2410 2415

gtc act ttg ccc gcg cta ctg gga ccg tat cag gat gta cag gca ata 7296
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile
2420 2425 2430

ttg tct tac ggc gat aaa gcc gga tta gct aac ggc tgt gaa gcg ctg 7344
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445

gca gtt tct cac ggt atg aat gac agc ggc caa ttc cag ctc gat ttc 7392
Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe
2450 2455 2460

aac gat ggc aaa ttc ctg cca ttc gaa ggc atc gcc att gat caa ggc 7440
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly
2465 2470 2475 2480

acg ctg aca ctg agc ttc cca aat gca tct atg ccg gag aaa ggt aaa 7488
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys
2485 2490 2495

caa gcc act atg tta aaa acc ctg aac gat atc att ttg cat att cgc 7536
Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg
2500 2505 2510

tac acc att aaa taa 7551
Tyr Thr Ile Lys
2515



<210> 2 
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<211> 7515 
<212> DNA 

<213> Photorhabdus luminescens 

<220> 

<221> CDS 

<222> (1) . . (7512) 

<400> 2 

atg caa aac tca tta tca agc act atc gat act att tgt cag aaa ctg 48
Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1 5 10 15 

caa tta act tgt ccg gcg gaa att get ttg tat ccc ttt gat act ttc 96 
Gin Leu Thr Cys Pro Ala Glu He Ala Leu Tyr Pro Phe Asp Thr Phe 
20 25 30 

cgg gaa aaa act cgg gga atg gtt aat tgg ggg gaa gca aaa cgg att 144
Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg He 
35 40 45 

tat gaa att gca caa gcg gaa cag gat aga aac eta ctt cat gaa aaa 192 
Tyr Glu He Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His Glu Lys 
50 55 60 

cgt att ttt gee tat get aat ccg ctg ctg aaa aac get gtt egg ttg 240 
Arg He Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu 
65 70 75 80 

ggt ace egg caa atg ttg ggt ttt ata caa ggt tat agt gat ctg ttt 288 
Gly Thr Arg Gin Met Leu Gly Phe He Gin Gly Tyr Ser Asp Leu Phe 
85 90 95 

ggt aat cgt get gat aac tat gec gcg ccg ggc teg gtt gca teg atg 336 
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met 
100 105 110

ttc tea ccg gcg get tat ttg acg gaa ttg tac cgt gaa gec aaa aac 384 
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn 
115 120 125 

ttg cat gac agc agc tca att tat tac cta gat aaa cgt cgc ccg gat 432
Leu His Asp Ser Ser Ser He Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 
130 135 140 

tta gca age tta atg etc age cag aaa aat atg gat gag gaa att tea 480 
Leu Ala Ser Leu Met Leu Ser Gin Lys Asn Met Asp Glu Glu He Ser 
145 150 155 160 

acg ctg get etc tct aat gaa ttg tgc ctt gec ggg ate gaa aca aaa 528 
Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr Lys 
165 170 175

aca gga aaa tea caa gat gaa gtg atg gat atg ttg tea act tat cgt 576 
Thr Gly Lys Ser Gin Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg 
180 185 190 

tta agt gga gag aca cct tat cat cac get tat gaa act gtt cgt gaa 624 
Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu 
195 200 205 




ate gtt cat gaa cgt gat cca gga ttt cgt cat ttg tea cag gca ccc 672 
He Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gin Ala Pro 
210 215 220 

att gtt gct gct aag ctc gat cct gtg act ttg ttg ggt att agc tcc 720
Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser
225 230 235 240

cat att tcg cca gaa ctg tat aac ttg ctg att gag gag atc ccg gaa 768
His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu
245 250 255

aaa gat gaa gcc gcg ctt gat acg ctt tat aaa aca aac ttt ggc gat 816
Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp
260 265 270

att act act gct cag tta atg tcc cca agt tat ctg gcc cgg tat tat 864
Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr
275 280 285

ggc gtc tca ccg gaa gat att gcc tac gtg acg act tca tta tca cat 912
Gly Val Ser Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His
290 295 300

gtt gga tat agc agt gat att ctg gtt att ccg ttg gtc gat ggt gtg 960
Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val
305 310 315 320

ggt aag atg gaa gta gtt cgt gtt acc cga aca cca tcg gat aat tat 1008
Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr
325 330 335

acc agt cag acg aat tat att gag ctg tat cca cag ggt ggc gac aat 1056
Thr Ser Gln Thr Asn Tyr Ile Glu Leu Tyr Pro Gln Gly Gly Asp Asn
340 345 350

tat ttg atc aaa tac aat cta agc aat agt ttt ggt ttg gat gat ttt 1104
Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
355 360 365

tat ctg caa tat aaa gat ggt tcc gct gat tgg act gag att gcc cat 1152
Tyr Leu Gln Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile Ala His
370 375 380

aat ccc tat cct gat atg gtc ata aat caa aag tat gaa tca cag gcg 1200
Asn Pro Tyr Pro Asp Met Val Ile Asn Gln Lys Tyr Glu Ser Gln Ala
385 390 395 400

aca atc aaa cgt agt gac tct gac aat ata ctc agt ata ggg tta caa 1248
Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly Leu Gln
405 410 415

aga tgg cat agc ggt agt tat aat ttt gcc gcc gcc aat ttt aaa att 1296
Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys Ile
420 425 430

gac caa tac tcc ccg aaa gct ttc ctg ctt aaa atg aat aag gct att 1344
Asp Gln Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala Ile
435 440 445

cgg ttg ctc aaa gct acc ggc ctc tct ttt gct acg ttg gag cgt att 1392
Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile
450 455 460

gtt gat agt gtt aat age acc aaa tec ate acg gtt gag gta tta aac 1440 
Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val Leu Asn 
465 470 475 480 

aag gtt tat cgg gta aaa ttc tat att gat cgt tat ggc atc agt gaa 1488
Lys Val Tyr Arg Val Lys Phe Tyr He Asp Arg Tyr Gly He Ser Glu 
485 490 495 

gag aca gee get att ttg get aat att aat ate tct cag caa get gtt 1536 
Glu Thr Ala Ala He Leu Ala Asn He Asn He Ser Gin Gin Ala Val 
500 505 510

ggc aat cag ctt agc cag ttt gag caa cta ttt aat cac ccg ccg ctc 1584
Gly Asn Gln Leu Ser Gln Phe Glu Gln Leu Phe Asn His Pro Pro Leu
515 520 525

aat ggt att cgc tat gaa atc agt gag gac aac tcc aaa cat ctt cct 1632
Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro
530 535 540

aat cct gat ctg aac ctt aaa cca gac agt acc ggt gat gat caa cgc 1680
Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gln Arg
545 550 555 560

aag gcg gtt tta aaa cgc gcg ttt cag gtt aac gcc agt gag ttg tat 1728
Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu Leu Tyr
565 570 575

cag atg tta ttg atc act gat cgt aaa gaa gac ggt gtt atc aaa aat 1776
Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn
580 585 590

aac tta gag aat ttg tct gat ctg tat ttg gtt agt ttg ctg gcc cag 1824
Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gln
595 600 605

att cat aac ctg act att gct gaa ttg aac att ttg ttg gtg att tgt 1872
Ile His Asn Leu Thr Ile Ala Glu Leu Asn Ile Leu Leu Val Ile Cys
610 615 620

ggc tat ggc gac acc aac att tat cag att acc gac gat aat tta gcc 1920
Gly Tyr Gly Asp Thr Asn Ile Tyr Gln Ile Thr Asp Asp Asn Leu Ala
625 630 635 640

aaa ata gtg gaa aca ttg ttg tgg atc act caa tgg ttg aag acc caa 1968
Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys Thr Gln
645 650 655

aaa tgg aca gtt acc gac ctg ttt ctg atg acc acg gcc act tac agc 2016
Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser
660 665 670

acc act tta acg cca gaa att agc aat ctg acg gct acg ttg tct tca 2064
Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu Ser Ser
675 680 685

act ttg cat ggc aaa gag agt ctg att ggg gaa gat ctg aaa aga gca 2112
Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala
690 695 700

atg gcg cct tgc ttc act tcg gct ttg cat ttg act tct caa gaa gtt 2160
Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val
705 710 715 720

gcg tat gac ctg ctg ttg tgg ata gac cag att caa ccg gca caa ata 2208
Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala Gln Ile
725 730 735

act gtt gat ggg ttt tgg gaa gaa gtg caa aca aca cca acc agc ttg 2256
Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr Ser Leu
740 745 750

aag gtg att acc ttt gct cag gtg ctg gca caa ttg agc ctg atc tat 2304
Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu Ile Tyr
755 760 765

cgt cgt att ggg tta agt gaa acg gaa ctg tca ctg atc gtg act caa 2352
Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gln
770 775 780

tct tct ctg cta gtg gca ggc aaa agc ata ctg gat cac ggt ctg tta 2400
Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu
785 790 795 800

acc ctg atg gcc ttg gaa ggt ttt cat acc tgg gtt aat ggc ttg ggg 2448
Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
805 810 815

caa cat gcc tcc ttg ata ttg gcg gcg ttg aaa gac gga gcc ttg aca 2496
Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr
820 825 830

gtt acc gat gta gca caa gct atg aat aag gag gaa tct ctc cta caa 2544
Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln
835 840 845

atg gca gct aat cag gtg gag aag gat cta aca aaa ctg acc agt tgg 2592
Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp
850 855 860

aca cag att gac gct att ctg caa tgg tta cag atg tct tcg gcc ttg 2640
Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu
865 870 875 880

gcg gtt tct cca ctg gat ctg gca ggg atg atg gcc ctg aaa tat ggg 2688
Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
885 890 895

ata gat cat aac tat gct gcc tgg caa gct gcg gcg gct gcg ctg atg 2736
Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met
900 905 910

gct gat cat gct aat cag gca cag aaa aaa ctg gat gag acg ttc agt 2784
Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser
915 920 925

aag gca tta tgt aac tat tat att aat gct gtt gtc gat agt gct gct 2832
Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala
930 935 940
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gga gta cgt gat cgt aac ggt tta tat acc tat ttg ctg att gat aat 2880 
Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu lie Asp Asn 
945 950 955 960 



cag gtt tct gcc gat gtg atc act tca cgt att gca gaa gct atc gcc 2928
Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala
965 970 975

ggt att caa ctg tac gtt aac cgg gct tta aac cga gat gaa ggt cag 2976
Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gln
980 985 990



ctt gca tcg gac gtt agt acc cgt cag ttc ttc act gac tgg gaa cgt 3024
Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg
995 1000 1005

tac aat aaa cgt tac agt act tgg gct ggt gtc tct gaa ctg gtc tat 3072
Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr
1010 1015 1020

tat cca gaa aac tat gtt gat ccc act cag cgc att ggg caa acc aaa 3120
Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gln Arg Ile Gly Gln Thr Lys
1025 1030 1035 1040

atg atg gat gcg ctg ttg caa tcc atc aac cag agc cag cta aat gcg 3168
Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala
1045 1050 1055

gat acg gtg gaa gat gct ttc aaa act tat ttg acc agc ttt gag cag 3216
Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gln
1060 1065 1070

gta gca aat ctg aaa gta att agt gct tac cac gat aat gtg aat gtg 3264
Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val
1075 1080 1085

gat caa gga tta act tat ttt atc ggt atc gac caa gca gct ccg ggt 3312
Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala Pro Gly
1090 1095 1100

acg tat tac tgg cgt agt gtt gat cac agc aaa tgt gaa aat ggc aag 3360
Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys
1105 1110 1115 1120

ttt gcc gct aat gct tgg ggt gag tgg aat aaa att acc tgt gct gtc 3408
Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val
1125 1130 1135

aat cct tgg aaa aat atc atc cgt ccg gtt gtt tat atg tcc cgc tta 3456
Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser Arg Leu
1140 1145 1150

tat ctg cta tgg ctg gag cag caa tca aag aaa agt gat gat ggt aaa 3504
Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys
1155 1160 1165

[three codons illegible] tat caa tat aac tta aaa ctg gct cat att cgt tac gac 3552
Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp
1170 1175 1180

ggt agt tgg aat aca cca ttt act ttt gat gtg aca gaa aag gta aaa 3600
Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys
1185 1190 1195 1200

aat tac acg tcg agt act gat gct gct gaa tct tta ggg ttg tat tgt 3648
Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys
1205 1210 1215

act ggt tat caa ggg gaa gac act cta tta gtt atg ttc tat tcg atg 3696
Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met
1220 1225 1230

cag agt agt tat agc tcc tat acc gat aat aat gcg ccg gtc act ggg 3744
Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly
1235 1240 1245

cta tat att ttc gct gat atg tca tca gac aat atg acg aat gca caa 3792
Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gln
1250 1255 1260

gca act aac tat tgg aat aac agt tat ccg caa ttt gat act gtg atg 3840
Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr Val Met
1265 1270 1275 1280

gca gat ccg gat agc gac aat aaa aaa gtc ata acc aga aga gtt aat 3888
Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn
1285 1290 1295

aac cgt tat gcg gag gat tat gaa att cct tcc tct gtg aca agt aac 3936
Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn
1300 1305 1310

agt aat tat tct tgg ggt gat cac agt tta acc atg ctt tat ggt ggt 3984
Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly
1315 1320 1325

agt gtt cct aat att act ttt gaa tcg gcg gca gaa gat tta agg cta 4032
Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu
1330 1335 1340

tct acc aat atg gca ttg agt att att cat aat gga tat gcg gga acc 4080
Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr
1345 1350 1355 1360

cgc cgt ata caa tgt aat ctt atg aaa caa tac gct tca tta ggt gat 4128
Arg Arg Ile Gln Cys Asn Leu Met Lys Gln Tyr Ala Ser Leu Gly Asp
1365 1370 1375

aaa ttt ata att tat gat tca tca ttt gat gat gca aac cgt ttt aat 4176
Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn
1380 1385 1390

ctg gtg cca ttg ttt aaa ttc gga aaa gac gag aac tca gat gat agt 4224
Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser
1395 1400 1405

att tgt ata tat aat gaa aac cct tcc tct gaa gat aag aag tgg tat 4272
Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr
1410 1415 1420

ttt tct tcg aaa gat gac aat aaa aca gcg gat tat aat ggt gga act 4320
Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr
1425 1430 1435 1440

caa tgt ata gat gct gga acc agt aac aaa gat ttt tat tat aat ctc 4368
Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu
1445 1450 1455

cag gag att gaa gta att agt gtt act ggt ggg tat tgg tcg agt tat 4416
Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr
1460 1465 1470

aaa ata tcc aac ccg att aat atc aat acg ggc att gat agt gct aaa 4464
Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1475 1480 1485



gta aaa gtc acc gta aaa gcg ggt ggt gac gat caa atc ttt act gct 4512
Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr Ala
1490 1495 1500

gat aat agt acc tat gtt cct cag caa ccg gca ccc agt ttt gag gag 4560
Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe Glu Glu
1505 1510 1515 1520

atg att tat cag ttc aat aac ctg aca ata gat tgt aag aat tta aat 4608
Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn
1525 1530 1535

ttc atc gac aat cag gca cat att gag att gat ttc acc gct acg gca 4656
Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala
1540 1545 1550

caa gat ggc cga ttc ttg ggt gca gaa act ttt att atc ccg gta act 4704
Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr
1555 1560 1565

aaa aaa gtt ctc ggt act gag aac gtg att gcg tta tat agc gaa aat 4752
Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn
1570 1575 1580

aac ggt gtt caa tat atg caa att ggc gca tat cgt acc cgt ttg aat 4800
Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn
1585 1590 1595 1600

acg tta ttc gct caa cag ttg gtt agc cgt gct aat cgt ggc att gat 4848
Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp
1605 1610 1615

gca gtg ctc agt atg gaa act cag aat att cag gaa ccg caa tta gga 4896
Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly
1620 1625 1630

gcg ggc aca tat gtg cag ctt gtg ttg gat aaa tat gat gag tct att 4944
Ala Gly Thr Tyr Val Gln Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile
1635 1640 1645

cat ggc act aat aaa agc ttt gct att gaa tat gtt gat ata ttt aaa 4992
His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys
1650 1655 1660

gag aac gat agt ttt gtg att tat caa gga gaa ctt agc gaa aca agt 5040
Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser
1665 1670 1675 1680

caa act gtt gtg aaa gtt ttc tta tcc tat ttt ata gag gcg act gga 5088
Gln Thr Val Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala Thr Gly
1685 1690 1695

aat aag aac cac tta tgg gta cgt gct aaa tac caa aag gaa acg act 5136
Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr
1700 1705 1710

gat aag atc ttg ttc gac cgt act gat gag aaa gat ccg cac ggt tgg 5184
Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp
1715 1720 1725

ttt ctc agc gac gat cac aag acc ttt agt ggt ctc tct tcc gca cag 5232
Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gln
1730 1735 1740

gca tta aag aac gac agt gaa ccg atg gat ttc tct ggc gcc aat gct 5280
Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala
1745 1750 1755 1760

ctc tat ttc tgg gaa ctg ttc tat tac acg ccg atg atg atg gct cat 5328
Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His
1765 1770 1775

cgt ttg ttg cag gaa cag aat ttt gat gcg gcg aac cat tgg ttc cgt 5376
Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn His Trp Phe Arg
1780 1785 1790

tat gtc tgg agt cca tcc ggt tat atc gtt gat ggt aaa att gct atc 5424
Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile
1795 1800 1805

tac cac tgg aac gtg cga ccg ctg gaa gaa gac acc agt tgg aat gca 5472
Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala
1810 1815 1820

caa caa ctg gac tcc acc gat cca gat gct gta gcc caa gat gat ccg 5520
Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp Asp Pro
1825 1830 1835 1840



atg cac tac aag gtg gct acc ttt atg gcg acg ttg gat ctg cta atg 5568
Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met
1845 1850 1855

gcc cgt ggt gat gct gct tac cgc cag tta gag cgt gat acg ttg gct 5616
Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala
1860 1865 1870

gaa gct aaa atg tgg tat aca cag gcg ctt aat ctg ttg ggt gat gag 5664
Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly Asp Glu
1875 1880 1885

cca caa gtg atg ctg agt acg act tgg gct aat cca aca ttg ggt aat 5712
Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn
1890 1895 1900

gct gct tca aaa acc aca cag cag gtt cgt cag caa gtg ctt acc cag 5760
Ala Ala Ser Lys Thr Thr Gln Gln Val Arg Gln Gln Val Leu Thr Gln
1905 1910 1915 1920

ttg cgt ctc aat agc agg gta aaa acc ccg ttg cta gga aca gcc aat 5808
Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn
1925 1930 1935

tcc ctg acc gct tta ttc ctg ccg cag gaa aat agc aag ctc aaa ggc 5856
Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn Ser Lys Leu Lys Gly
1940 1945 1950

tac tgg cgg aca ctg gcg cag cgt atg ttt aat tta cgt cat aat ctg 5904
Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His Asn Leu
1955 1960 1965

tcg att gac ggc cag ccg ctc tcc ttg ccg ctg tat gct aaa ccg gct 5952
Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala
1970 1975 1980

gat cca aaa gct tta ctg agt gcg gcg gtt tca gct tct caa ggg gga 6000
Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln Gly Gly
1985 1990 1995 2000

gcc gac ttg ccg aag gcg ccg ctg act att cac cgc ttc cct caa atg 6048
Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gln Met
2005 2010 2015

cta gaa ggg gca cgg ggc ttg gtt aac cag ctt ata cag ttc ggt agt 6096
Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe Gly Ser
2020 2025 2030

tca cta ttg ggg tac agt gag cgt cag gat gcg gaa gct atg agt caa 6144
Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met Ser Gln
2035 2040 2045

cta ctg caa acc caa gcc agc gag tta ata ctg acc agt att cgt atg 6192
Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile Arg Met
2050 2055 2060

cag gat aac caa ttg gca gag ctg gat tcg gaa aaa acc gcc ttg caa 6240
Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gln
2065 2070 2075 2080

gtc tct tta gct gga gtg caa caa cgg ttt gac agc tat agc caa ctg 6288
Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp Ser Tyr Ser Gln Leu
2085 2090 2095

tat gag gag aac atc aac gca ggt gag cag cga gcg ctg gcg tta cgc 6336
Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala Leu Arg
2100 2105 2110

tca gaa tct gct att gag tct cag gga gcg cag att tcc cgt atg gca 6384
Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg Met Ala
2115 2120 2125

ggc gcg ggt gtt gat atg gca cca aat atc ttc ggc ctg gct gat ggc 6432
Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly
2130 2135 2140

ggc atg cat tat ggt gct att gcc tat gcc atc gct gac ggt att gag 6480
Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu
2145 2150 2155 2160

ttg agt gct tct gcc aag atg gtt gat gcg gag aaa gtt gct cag tcg 6528
Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser
2165 2170 2175

gaa ata tat cgc cgt cgc cgt caa gaa tgg aaa att cag cgt gac aac 6576
Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn
2180 2185 2190

gca caa gcg gag att aac cag tta aac gcg caa ctg gaa tca ctg tct 6624
Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser
2195 2200 2205

att cgc cgt gaa gcc gct gaa atg caa aaa gag tac ctg aaa acc cag 6672
Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys Thr Gln
2210 2215 2220

caa gct cag gcg cag gca caa ctt act ttc tta aga agc aaa ttc agt 6720
Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys Phe Ser
2225 2230 2235 2240

aat caa gcg tta tat agt tgg tta cga ggg cgt ttg tca ggt att tat 6768
Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr
2245 2250 2255

ttc cag ttc tat gac ttg gcc gta tca cgt tgc ctg atg gca gag caa 6816
Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gln
2260 2265 2270

tcc tat caa tgg gaa gct aat gat aat tcc att agc ttt gtc aaa ccg 6864
Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro
2275 2280 2285

ggt gca tgg caa gga act tac gcc ggc tta ttg tgt gga gaa gct ttg 6912
Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu
2290 2295 2300

ata caa aat ctg gca caa atg gaa gag gca tat ctg aaa tgg gaa tct 6960
Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser
2305 2310 2315 2320

cgc gct ttg gaa gta gaa cgc acg gtt tca ttg gca gtg gtt tat gat 7008
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp
2325 2330 2335

tca ctg gaa ggt aat gat cgt ttt aat tta gcg gaa caa ata cct gca 7056
Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile Pro Ala
2340 2345 2350

tta ttg gat aag ggg gag gga aca gca gga act aaa gaa aat ggg tta 7104
Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu
2355 2360 2365

tca ttg gct aat gct atc ctg tca gct tcg gtc aaa ttg tcc gac ttg 7152
Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu
2370 2375 2380

aaa ctg gga acg gat tat cca gac agt atc gtt ggt agc aac aag gtt 7200
Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val
2385 2390 2395 2400

cgt cgt att aag caa atc agt gtt tcg cta cct gca ttg gtt ggg cct 7248
Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro
2405 2410 2415

tat cag gat gtt cag gct atg ctc agc tat ggt ggc agt act caa ttg 7296
Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu
2420 2425 2430

ccg aaa ggt tgt tca gcg ttg gct gtg tct cat ggt acc aat gat agt 7344
Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser
2435 2440 2445

ggt cag ttc cag ttg gat ttc aat gac ggc aaa tac ctg cca ttt gaa 7392
Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu
2450 2455 2460

ggt att gct ctt gat gat cag ggt aca ctg aat ctt caa ttt ccg aat 7440
Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn
2465 2470 2475 2480

gct acc gac aag cag aaa gca ata ttg caa act atg agc gat att att 7488
Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile
2485 2490 2495

ttg cat att cgt tat acc atc cgt taa 7515
Leu His Ile Arg Tyr Thr Ile Arg
2500



<210> 3
<211> 7577
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (3)..(7553)

<220>
<223> Description of Artificial Sequence: hemicot tcdA

<400> 3

cc atg gct aac gag tcc gtc aag gag atc cca gac gtc ctc aag tcc 47
Met Ala Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser
1 5 10 15

caa tgc ggt ttc aac tgc ctc act gac atc tcc cac agc tcc ttc aac 95
Gln Cys Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn
20 25 30

gag ttc aga caa caa gtc tct gag cac ctc tcc tgg tcc gag acc cat 143
Glu Phe Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His
35 40 45

gac ctc tac cat gac gct cag caa gct cag aag gac aac agg ctc tac 191
Asp Leu Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr
50 55 60

gag gct agg atc ctc aag agg gct aac cca caa ctc cag aac gct gtc 239
Glu Ala Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val
65 70 75

cac ctc gcc atc ttg gct cca aac gct gag ttg att ggt tac aac aac 287
His Leu Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn
80 85 90 95

cag ttc tct ggc aga gct agc cag tac gtg gct cct ggt aca gtc tcc 335
Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser
100 105 110

tcc atg ttc agc cca gcc gct tac ctc act gag ttg tac cgc gag gct 383
Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala
115 120 125

agg aac ctt cat gct tct gac tcc gtc tac tac ttg gac aca cgc aga 431
Arg Asn Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg
130 135 140

cca gac ctc aag agc atg gcc ctc agc caa cag aac atg gac att gag 479
Pro Asp Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu
145 150 155

ttg tcc acc ctc tcc ttg agc aac gag ctt ctc ttg gag tcc atc aag 527
Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys
160 165 170 175

act gag agc aag ttg gag aac tac acc aag gtc atg gag atg ctc tcc 575
Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser
180 185 190

acc ttc aga cca agc ggt gca act cca tac cat gat gcc tac gag aac 623
Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn
195 200 205

gtc agg gag gtc atc caa ctt caa gac cct ggt ctt gag caa ctc aac 671
Val Arg Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn
210 215 220

gct tct cca gcc att gct ggt ttg atg cac cag gca tcc ttg ctc ggt 719
Ala Ser Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly
225 230 235

atc aac gcc tcc atc tct cct gag ttg ttc aac atc ttg act gag gag 767
Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu
240 245 250 255

atc act gag ggc aac gct gag gag ttg tac aag aag aac ttc ggc aac 815
Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn
260 265 270

att gag cca gcc tct ctt gca atg cct gag tac ctc aag agg tac tac 863
Ile Glu Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr
275 280 285

aac ttg tct gat gag gag ctt tct caa ttc att ggc aag gct tcc aac 911
Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn
290 295 300

ttc ggt caa cag gag tac agc aac aac cag ctc atc act cca gtt gtg 959
Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val
305 310 315

aac tcc tct gat ggc act gtg aag gtc tac cgc atc aca cgt gag tac 1007
Asn Ser Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr
320 325 330 335

acc aca aac gcc tac caa atg gat gtt gag ttg ttc cca ttc ggt ggt 1055
Thr Thr Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly
340 345 350

gag aac tac aga ctt gac tac aag ttc aag aac ttc tac aac gcc tcc 1103
Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser
355 360 365

tac ctc tcc atc aag ttg aac gac aag agg gag ctt gtc agg act gag 1151
Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu
370 375 380

ggt gct cct caa gtg aac att gag tac tct gcc aac atc acc ctc aac 1199
Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn
385 390 395

aca gct gac atc tct caa cca ttc gag att ggt ttg acc aga gtc ctt 1247
Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu
400 405 410 415

ccc tct ggc tcc tgg gcc tac gct gca gcc aag ttc act gtt gag gag 1295
Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu
420 425 430

tac aac cag tac tct ttc ctc ttg aag ctc aac aag gca att cgt ctc 1343
Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu
435 440 445

agc aga gcc act gag ttg tct ccc acc atc ttg gag ggc att gtg agg 1391
Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg
450 455 460

tct gtc aac ctt caa ctt gac atc aac act gat gtg ctt ggc aag gtc 1439
Ser Val Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val
465 470 475

ttc ctc acc aag tac tac atg caa cgc tac gcc atc cat gct gag act 1487
Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr
480 485 490 495

gca ctc atc ctc tgc aac gca ccc atc tct caa cgc tcc tac gac aac 1535
Ala Leu Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn
500 505 510

cag cct tcc cag ttc gac agg ctc ttc aac act cct ctc ttg aac ggc 1583
Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly
515 520 525

cag tac ttc tcc act ggt gat gag gag att gac ctc aac tct ggc tcc 1631
Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser
530 535 540

aca ggt gac tgg aga aag acc atc ttg aag agg gcc ttc aac att gat 1679
Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp
545 550 555

gat gtc tct ctc ttc cgt ctc ttg aag atc aca gat cac gac aac aag 1727
Asp Val Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys
560 565 570 575

gat ggc aag atc aag aac aac ttg aag aac ctt tcc aac ctc tac att 1775
Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile
580 585 590

ggc aag ttg ctt gca gac atc cac caa ctc acc att gat gag ttg gac 1823
Gly Lys Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp
595 600 605

ctc ttg ctc att gca gtc ggt gag ggc aag acc aac ctc tct gca atc 1871
Leu Leu Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile
610 615 620

tct gac aag cag ttg gca acc ctc atc agg aag ttg aac acc atc acc 1919
Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg Lys Leu Asn Thr Ile Thr
625 630 635

tcc tgg ctt cac acc cag aag tgg tct gtc ttc caa ctc ttc atc atg 1967
Ser Trp Leu His Thr Gln Lys Trp Ser Val Phe Gln Leu Phe Ile Met
640 645 650 655

acc agc acc tcc tac aac aag acc ctc act cct gag atc aag aac ctc 2015
Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu
660 665 670

ttg gac aca gtc tac cac ggt ctc caa ggc ttc gac aag gac aag gct 2063
Leu Asp Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala
675 680 685

gac ttg ctt cat gtc atg gct ccc tac att gca gcc acc ctc caa ctc 2111
Asp Leu Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu
690 695 700

tcc tct gag aac gtg gct cac tct gtc ttg ctc tgg gct gac aag ctc 2159
Ser Ser Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu
705 710 715

caa cct ggt gat ggt gcc atg act gct gag aag ttc tgg gac tgg ctc 2207
Gln Pro Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu
720 725 730 735

aac acc aag tac aca cca ggc tcc tct gag gct gtt gag act caa gag 2255
Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu
740 745 750

cac att gtg caa tac tgc cag gct ctt gca cag ttg gag atg gtc tac 2303
His Ile Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr
755 760 765

cac tcc act ggc atc aac gag aac gct ttc aga ctc ttc gtc acc aag 2351
His Ser Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys
770 775 780

cct gag atg ttc ggt gct gcc aca ggt gct gca cct gct cat gat gct 2399
Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala
785 790 795

ctc tcc ctc atc atg ttg acc agg ttc gct gac tgg gtc aac gct ctt 2447
Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu
800 805 810 815

ggt gag aag gct tcc tct gtc ttg gct gcc ttc gag gcc aac tcc ctc 2495
Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu
820 825 830

act gct gag caa ctt gct gat gcc atg aac ctt gat gcc aac ctc ttg 2543
Thr Ala Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu
835 840 845

ctc caa gct tcc att caa gct cag aac cac caa cac ctc cca cct gtc 2591
Leu Gln Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val
850 855 860

act cca gag aac gct ttc tcc tgc tgg acc tcc atc aac acc atc ctc 2639
Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu
865 870 875

caa tgg gtc aac gtg gct cag caa ctc aac gtg gct cca caa ggt gtc 2687
Gln Trp Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val
880 885 890 895

tct gct ttg gtc ggt ctt gac tac atc cag tcc atg aag gag aca cca 2735
Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro
900 905 910

acc tac gct caa tgg gag aac gca gct ggt gtc ttg act gct ggt ctc 2783
Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu
915 920 925

aac tcc caa cag gcc aac acc ctc cat gct ttc ttg gat gag tct cgc 2831
Asn Ser Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg
930 935 940

tct gct gcc ctc tcc acc tac tac atc agg caa gtc gcc aag gca gct 2879
Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala
945 950 955

gct gcc atc aag tct cgc gat gac ctc tac caa tac ctc ctc att gac 2927
Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp
960 965 970 975

aac cag gtc tct gct gcc atc aag acc acc agg atc gct gag gcc atc 2975
Asn Gln Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile
980 985 990

gct tcc atc caa ctc tac gtc aac cgc gct ctt gag aac gtt gag gag 3023
Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu
995 1000 1005

aac gcc aac tct ggt gtc atc tct cgc caa ttc ttc atc gac tgg gac 3071
Asn Ala Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp
1010 1015 1020

aag tac aac aag agg tac tcc acc tgg gct ggt gtc tct caa ctt gtc 3119
Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val
1025 1030 1035



tac tac cca gag aac tac att gac cca acc atg agg att ggt cag acc 3167
Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr
1040 1045 1050 1055

aag atg atg gat gct ctc ttg caa tct gtc tcc caa agc caa ctc aac 3215
Lys Met Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn
1060 1065 1070

gct gac act gtg gag gat gcc ttc atg agc tac ctc acc tcc ttc gag 3263
Ala Asp Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu
1075 1080 1085

caa gtt gcc aac ctc aag gtc atc tct gct tac cat gac aac atc aac 3311
Gln Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn
1090 1095 1100

aac gac caa ggt ctc acc tac ttc att ggt ctc tct gag act gat gct 3359
Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala
1105 1110 1115

ggt gag tac tac tgg aga tcc gtg gac cac agc aag ttc aac gat ggc 3407
Gly Glu Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly
1120 1125 1130 1135

aag ttc gct gca aac gct tgg tct gag tgg cac aag att gac tgc cct 3455
Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro
1140 1145 1150

atc aac cca tac aag tcc acc atc aga cct gtc atc tac aag agc cgc 3503
Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg
1155 1160 1165

ctc tac ttg ctc tgg ctt gag cag aag gag atc acc aag caa act ggc 3551
Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly
1170 1175 1180

aac tcc aag gat ggt tac caa act gag act gac tac cgc tac gag ttg 3599
Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu
1185 1190 1195

aag ttg gct cac atc cgc tac gat ggt acc tgg aac act cca atc acc 3647
Lys Leu Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr
1200 1205 1210 1215

ttc gat gtc aac aag aag atc agc gag ttg aag ttg gag aag aac cgt 3695
Phe Asp Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg
1220 1225 1230

gct cct ggt ctc tac tgc gct ggt tac caa ggt gag gac acc ctc ttg 3743
Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu
1235 1240 1245

gtc atg ttc tac aac cag caa gac acc ctt gac tcc tac aag aac gct 3791
Val Met Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala
1250 1255 1260

tcc atg caa ggt ctc tac atc ttc gct gac atg gct tcc aag gac atg 3839
Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met
1265 1270 1275

act cca gag caa agc aac gtc tac cgt gac aac tcc tac caa cag ttc 3887
Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln Phe
1280 1285 1290 1295

gac acc aac aac gtc agg cgt gtc aac aac aga tac gct gag gac tac 3935
Asp Thr Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr
1300 1305 1310

gag atc cca agc tct gtc agc tct cgc aag gac tac ggc tgg ggt gac 3983
Glu Ile Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp
1315 1320 1325

tac tac ctc agc atg gtg tac aac ggt gac atc cca acc atc aac tac 4031
Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr
1330 1335 1340

aag gct gcc tct tcc gac ctc aaa atc tac atc agc cca aag ctc agg 4079
Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg
1345 1350 1355

atc atc cac aac ggc tac gag ggt cag aag agg aac cag tgc aac ttg 4127
Ile Ile His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu
1360 1365 1370 1375

atg aac aag tac ggc aag ttg ggt gac aag ttc att gtc tac acc tct 4175
Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser
1380 1385 1390

ctt ggt gtc aac cca aac aac agc tcc aac aag ctc atg ttc tac cca 4223
Leu Gly Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro
1395 1400 1405

gtc tac caa tac tct ggc aac acc tct ggt ctc aac cag ggt aga ctc 4271
Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gln Gly Arg Leu
1410 1415 1420

ttg ttc cac agg gac acc acc tac cca agc aag gtg gag gct tgg att 4319
Leu Phe His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile
1425 1430 1435

cct ggt gcc aag agg tcc ctc acc aac cag aac gct gcc att ggt gat 4367
Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp
1440 1445 1450 1455

gac tac gcc aca gac tcc ctc aac aag cct gat gac ctc aag cag tac 4415
Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr
1460 1465 1470

atc ttc atg act gac tcc aag ggc aca gcc act gat gtc tct ggt cca 4463
Ile Phe Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro
1475 1480 1485

gtg gag atc aac act gca atc agc cca gcc aag gtc caa atc att gtc 4511
Val Glu Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val
1490 1495 1500

aag gct ggt ggc aag gag caa acc ttc aca gct gac aag gat gtc tcc 4559
Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser
1505 1510 1515

atc cag cca agc cca tcc ttc gat gag atg aac tac caa ttc aac gct 4607
Ile Gln Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala
1520 1525 1530 1535

ctt gag att gat ggt tct ggc ctc aac ttc atc aac aac tct gct tcc 4655
Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser
1540 1545 1550

att gat gtc acc ttc act gcc ttc gct gag gat ggc cgc aag ttg ggt 4703
Ile Asp Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly
1555 1560 1565

tac gag agc ttc tcc atc cca gtc acc ctt aag gtt tcc act gac aac 4751
Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn
1570 1575 1580

gca ctc acc ctt cat cac aac gag aac ggt gct cag tac atg caa tgg 4799
Ala Leu Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp
1585 1590 1595

caa agc tac cgc acc agg ttg aac acc ctc ttc gca agg caa ctt gtg 4847
Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val
1600 1605 1610 1615

gcc cgt gcc acc aca ggc att gac acc atc ctc agc atg gag acc cag 4895
Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln
1620 1625 1630

aac atc caa gag cca cag ttg ggc aag ggt ttc tac gcc acc ttc gtc 4943
Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val
1635 1640 1645

atc cca cct tac aac ctc agc act cat ggt gat gag agg tgg ttc aag 4991
Ile Pro Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys
1650 1655 1660

ctc tac atc aag cac gtg gtt gac aac aac tcc cac atc atc tac tct 5039
Leu Tyr Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser
1665 1670 1675

ggt caa ctc act gac acc aac atc aac atc acc ctc ttc atc cca ctt 5087
Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu
1680 1685 1690 1695

gac gat gtc cca ctc aac cag gac tac cat gcc aag gtc tac atg acc 5135
Asp Asp Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr
1700 1705 1710

ttc aag aag tct cca tct gat ggc acc tgg tgg ggt cca cac ttc gtc 5183
Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val
1715 1720 1725

cgt gat gac aag ggc atc gtc acc atc aac cca aag tcc atc ctc acc 5231
Arg Asp Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr
1730 1735 1740

cac ttc gag tct gtc aac gtt ctc aac aac atc tcc tct gag cca atg 5279
His Phe Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met
1745 1750 1755

gac ttc tct ggt gcc aac tcc ctc tac ttc tgg gag ttg ttc tac tac 5327
Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr
1760 1765 1770 1775

aca cca atg ctt gtg gct caa agg ttg ctc cat gag cag aac ttc gat 5375
Thr Pro Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp
1780 1785 1790

gag gcc aac agg tgg ctc aag tac gtc tgg agc cca tct ggt tac att 5423
Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile
1795 1800 1805

gtg cat ggt caa atc cag aac tac caa tgg aac gtc agg cca ttg ctt 5471
Val His Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu
1810 1815 1820

gag gac acc tcc tgg aac tct gac cca ctt gac tct gtg gac cct gat 5519
Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp
1825 1830 1835

gct gtg gct caa cat gac cca atg cac tac aag gtc tcc acc ttc atg 5567
Ala Val Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met
1840 1845 1850 1855

agg acc ttg gac ctc ttg att gcc aga ggt gac cat gct tac cgc caa 5615
Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln
1860 1865 1870

ttg gag agg gac acc ctc aac gag gca aag atg tgg tac atg caa gct 5663
Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala
1875 1880 1885

ctc cac ctc ttg ggt gac aag cca tac ctc cca ctc agc acc act tgg 5711
Leu His Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp
1890 1895 1900

tcc gac cca agg ttg gac cgt gct gct gac atc acc act cag aac gct 5759
Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala
1905 1910 1915

cat gac tct gcc att gtt gct ctc agg cag aac atc cca act cct gct 5807
His Asp Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala
1920 1925 1930 1935

cca ctc tcc ctc aga tct gct aac acc ctc act gac ttg ttc ctc cca 5855
Pro Leu Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro
1940 1945 1950

cag atc aac gag gtc atg atg aac tac tgg caa acc ttg gct caa agg 5903
Gln Ile Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg
1955 1960 1965

gtc tac aac ctc aga cac aac ctc tcc att gat ggt caa cca ctc tac 5951
Val Tyr Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr
1970 1975 1980

ctc cca atc tac gcc aca cca gct gac cca aag gct ctt ctc tct gct 5999
Leu Pro Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala
1985 1990 1995

gct gtg gct acc agc caa ggt ggt ggc aag ctc cca gag tcc ttc atg 6047
Ala Val Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met
2000 2005 2010 2015

tcc ctc tgg agg ttc cca cac atg ttg gag aac gcc cgt ggc atg gtc 6095
Ser Leu Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val
2020 2025 2030

tcc caa ctc acc cag ttc ggt tcc acc ctc cag aac atc att gag agg 6143
Ser Gln Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg
2035 2040 2045

caa gat gct gag gct ctc aac gct ttg ctc cag aac cag gca gct gag 6191
Gln Asp Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu
2050 2055 2060

ttg atc ctc acc aac ttg tcc atc caa gac aag acc att gag gag ctt 6239
Leu Ile Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu
2065 2070 2075

gat gct gag aag aca gtc ctt gag aag agc aag gct ggt gcc caa tct 6287
Asp Ala Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser
2080 2085 2090 2095

cgc ttc gac tcc tac ggc aag ctc tac gat gag aac atc aac gct ggt 6335
Arg Phe Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly
2100 2105 2110

gag aac cag gcc atg acc ctc agg gct tcc gca gct ggt ctc acc act 6383
Glu Asn Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr
2115 2120 2125

gct gtc caa gcc tct cgc ttg gct ggt gca gct gct gac ctc gtt cca 6431
Ala Val Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro
2130 2135 2140

aac atc ttc ggt ttc gct ggt ggt ggc tcc aga tgg ggt gcc att gct 6479
Asn Ile Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala
2145 2150 2155

gag gct acc ggt tac gtc atg gag ttc tct gcc aac gtc atg aac act 6527
Glu Ala Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr
2160 2165 2170 2175

gag gct gac aag atc agc caa tct gag acc tac aga agg cgc cgt caa 6575
Glu Ala Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln
2180 2185 2190

gag tgg gag atc caa agg aac aac gct gag gca gag ttg aag caa atc 6623
Glu Trp Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile
2195 2200 2205

gat gct caa ctc aag tcc ttg gct gtc aga agg gag gct gct gtc ctc 6671
Asp Ala Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu
2210 2215 2220

cag aag acc tcc ctc aag acc caa cag gag caa acc cag tcc cag ttg 6719
Gln Lys Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu
2225 2230 2235

gct ttc ctc caa agg aag ttc tcc aac cag gct ctc tac aac tgg ctc 6767
Ala Phe Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu
2240 2245 2250 2255

aga ggc cgc ttg gct gcc atc tac ttc caa ttc tac gac ctt gct gtg 6815
Arg Gly Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val
2260 2265 2270

gcc agg tgc ctc atg gct gag caa gcc tac cgc tgg gag ttg aac gat 6863
Ala Arg Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp
2275 2280 2285

gac tcc gcc agg ttc atc aag cca ggt gct tgg caa ggc acc tac gct 6911
Asp Ser Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala
2290 2295 2300

ggt ctc ctt gct ggt gag acc ctc atg ctc tcc ttg gct caa atg gag 6959
Gly Leu Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu
2305 2310 2315

gat gct cac ctc aag agg gac aag agg gct ttg gag gtg gag agg aca 7007
Asp Ala His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr
2320 2325 2330 2335

gtc tcc ctt gct gag gtc tac gct ggt ctc cca aag gac aac ggt cca 7055
Val Ser Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro
2340 2345 2350

ttc tcc ctt gct caa gag att gac aag ttg gtc agc caa ggt tct ggt 7103
Phe Ser Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly
2355 2360 2365

tct gct ggt tct ggt aac aac aac ttg gct ttc ggc gct ggt act gac 7151
Ser Ala Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp
2370 2375 2380

acc aag acc tcc ctc caa gcc tct gtc tcc ttc gct gac ctc aag atc 7199
Thr Lys Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile
2385 2390 2395

agg gag gac tac cca gct tcc ctt ggc aag atc agg cgc atc aag caa 7247
Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln
2400 2405 2410 2415

atc tct gtc acc ctc cca gct ctc ttg ggt cca tac caa gat gtc caa 7295
Ile Ser Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln
2420 2425 2430

gca atc ctc tcc tac ggt gac aag gct ggt ttg gcg aac ggt tgc gag 7343
Ala Ile Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu
2435 2440 2445

gct ctt gct gtc tct cat ggc atg aac gac tct ggt caa ttc caa ctt 7391
Ala Leu Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu
2450 2455 2460

gac ttc aac gat ggc aag ttc ctc cca ttc gag ggc att gcc att gac 7439
Asp Phe Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp
2465 2470 2475

caa ggc acc ctc acc ctc tcc ttc cca aac gct tcc atg cca gag aag 7487
Gln Gly Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys
2480 2485 2490 2495



gga aag caa gcc acc atg ctc aag acc ctc aac gat atc atc ctc cac 7535
Gly Lys Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His
2500 2505 2510

atc agg tac acc atc aag tgagctcgag aggcctgcgg ccgc 7577
Ile Arg Tyr Thr Ile Lys
2515

<210> 4 
<211> 7541 
<212> DNA 

<213> Artificial Sequence 

<220> 

<221> CDS 

<222> (3)..(7517)

<220> 

<223> Description of Artificial Sequence: hemicot tcbA
<400> 4 

cc atg gct cag aac tcc ctc agc tcc acc att gac acc atc tgc cag 47
Met Ala Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln
1 5 10 15

aag ctt caa ctc acc tgc cca gct gag atc gcc ctc tac cca ttc gac 95
Lys Leu Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp
20 25 30

acc ttc cgt gag aag acc aga ggc atg gtc aac tgg ggt gag gcc aag 143
Thr Phe Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys
35 40 45

agg atc tac gag att gct caa gct gag caa gac agg aac ctc ctt cat 191
Arg Ile Tyr Glu Ile Ala Gln Ala Glu Gln Asp Arg Asn Leu Leu His
50 55 60

gag aag agg atc ttc gcc tac gct aac cca ttg ctc aag aac gct gtc 239
Glu Lys Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val
65 70 75

agg ctt ggt acc agg caa atg ttg ggt ttc atc caa ggt tac tct gac 287
Arg Leu Gly Thr Arg Gln Met Leu Gly Phe Ile Gln Gly Tyr Ser Asp
80 85 90 95

ttg ttc ggc aac agg gct gac aac tac gca gct cct ggt tct gtt gct 335
Leu Phe Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala
100 105 110

agc atg ttc agc cca gct gcc tac ctc act gag ttg tac cgt gag gcc 383
Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala
115 120 125

aag aac ctc cat gac agc tcc agc atc tac tac ctt gac aag agg cgc 431
Lys Asn Leu His Asp Ser Ser Ser Ile Tyr Tyr Leu Asp Lys Arg Arg
130 135 140

cca gac ctt gct tcc ttg atg ctc tcc cag aag aac atg gat gag gag 479
Pro Asp Leu Ala Ser Leu Met Leu Ser Gln Lys Asn Met Asp Glu Glu
145 150 155

atc agc acc ttg gct ctc tcc aac gag ctt tgc ttg gct ggc att gag 527
Ile Ser Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly Ile Glu
160 165 170 175

acc aag act ggc aag tcc caa gat gag gtc atg gac atg ctc tcc acc 575
Thr Lys Thr Gly Lys Ser Gln Asp Glu Val Met Asp Met Leu Ser Thr
180 185 190

tac cgc ctc tct ggt gag act cca tac cac cat gct tac gag act gtc 623
Tyr Arg Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val
195 200 205

agg gag att gtc cat gag agg gac cca ggt ttc cgc cac ctc tcc caa 671
Arg Glu Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln
210 215 220

gct ccc att gtg gct gcc aag ttg gac cca gtc acc ctc ttg ggc atc 719
Ala Pro Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile
225 230 235

tcc agc cac atc agc cca gag ttg tac aac ctt ctc att gag gag atc 767
Ser Ser His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile
240 245 250 255

cca gag aag gat gag gca gct ttg gac acc ctc tac aag acc aac ttc 815
Pro Glu Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe
260 265 270

ggt gac atc acc act gct caa ctc atg agc cca tcc tac ttg gcc agg 863
Gly Asp Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg
275 280 285

tac tac ggt gtc tct cca gag gac att gct tac gtc acc aca agc ctc 911
Tyr Tyr Gly Val Ser Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu
290 295 300

tcc cat gtg ggt tac tcc tct gac atc ctt gtc atc cca ctc gtg gat 959
Ser His Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp
305 310 315

ggt gtg ggc aag atg gag gtt gtc agg gtc acc agg act cca tct gac 1007
Gly Val Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp
320 325 330 335

aac tac acc tcc cag acc aac tac att gag ttg tac cca caa ggt ggt 1055
Asn Tyr Thr Ser Gln Thr Asn Tyr Ile Glu Leu Tyr Pro Gln Gly Gly
340 345 350

gac aac tac ctc atc aag tac aac ctc tcc aac tct ttc ggt ttg gat 1103
Asp Asn Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp
355 360 365

gac ttc tac ctc cag tac aag gat ggt tct gct gac tgg act gag att 1151
Asp Phe Tyr Leu Gln Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile
370 375 380

gct cac aac cca tac cca gac atg gtc atc aac cag aag tac gag tcc 1199
Ala His Asn Pro Tyr Pro Asp Met Val Ile Asn Gln Lys Tyr Glu Ser
385 390 395

caa gcc acc atc aag aga tct gac tct gac aac atc ctc tcc att ggt 1247
Gln Ala Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly
400 405 410 415

ctc caa agg tgg cac tct ggt tcc tac aac ttc gct gct gcc aac ttc 1295
Leu Gln Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe
420 425 430

aag att gac caa tac tct cca aag gct ttc ctc ttg aag atg aac aag 1343
Lys Ile Asp Gln Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys
435 440 445

gcc atc agg ctc ttg aag gcc act ggt ctc tcc ttc gcc acc ctt gag 1391
Ala Ile Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu
450 455 460

agg att gtg gac tct gtc aac tcc acc aag tcc atc act gtg gag gtc 1439
Arg Ile Val Asp Ser Val Asn Ser Thr Lys Ser Ile Thr Val Glu Val
465 470 475

ctc aac aag gtc tac aga gtc aag ttc tac att gac cgc tac ggc atc 1487
Leu Asn Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile
480 485 490 495

tct gag gag act gct gcc atc ctt gcc aac atc aac atc tcc cag caa 1535
Ser Glu Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gln Gln
500 505 510

gct gtc ggc aac cag ctc tcc caa ttc gag caa ctc ttc aac cac cct 1583
Ala Val Gly Asn Gln Leu Ser Gln Phe Glu Gln Leu Phe Asn His Pro
515 520 525

cca ctc aac ggc atc cgc tac gag atc agc gag gac aac tcc aag cac 1631
Pro Leu Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His
530 535 540

ctc cca aac cca gac ctc aac ctc aag cca gac tcc act ggt gat gac 1679
Leu Pro Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp
545 550 555

caa agg aag gct gtc ctc aag agg gct ttc caa gtc aac gct tct gag 1727
Gln Arg Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu
560 565 570 575

ctt tac caa atg ctc ttg atc act gac agg aag gag gat ggt gtc atc 1775
Leu Tyr Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile
580 585 590

aag aac aac ttg gag aac ctc tct gac ctc tac ctt gtc tcc ctc ttg 1823
Lys Asn Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu
595 600 605

gcc caa atc cac aac ttg acc att gct gag ttg aac atc ctc ttg gtc 1871
Ala Gln Ile His Asn Leu Thr Ile Ala Glu Leu Asn Ile Leu Leu Val
610 615 620

atc tgc ggt tac ggt gac acc aac atc tac caa atc act gac gac aac 1919
Ile Cys Gly Tyr Gly Asp Thr Asn Ile Tyr Gln Ile Thr Asp Asp Asn
625 630 635

ctt gcc aag att gtg gag acc ctc ttg tgg atc acc caa tgg ctc aag 1967
Leu Ala Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys
640 645 650 655

acc cag aag tgg act gtc aca gac ctc ttc ctc atg acc act gcc acc 2015
Thr Gln Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr
660 665 670

tac tcc acc act ctc act cca gag att tcc aac ctc act gcc acc ctc 2063
Tyr Ser Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu
675 680 685

agc tcc acc ctc cac ggc aag gag tcc ctc att ggt gag gac ctc aag 2111
Ser Ser Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys
690 695 700

agg gca atg gct cca tgc ttc acc tct gct ctc cac ctc acc tcc caa 2159
Arg Ala Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln
705 710 715

gag gtg gct tac gac ctc ctt ctc tgg att gac caa atc caa cca gct 2207
Glu Val Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala
720 725 730 735

caa atc act gtg gat ggt ttc tgg gag gag gtc caa acc act cca acc 2255
Gln Ile Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr
740 745 750

tcc ctc aag gtc atc acc ttc gct caa gtc ttg gct caa ctc tcc ctc 2303
Ser Leu Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu
755 760 765

atc tac aga agg att ggt ctc tct gag act gag ttg tcc ctc att gtc 2351
Ile Tyr Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val
770 775 780

acc caa tcc agc ctc ttg gtc gct ggc aag tcc atc ctt gat cat ggt 2399
Thr Gln Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly
785 790 795

ctc ttg acc ctc atg gct ctt gag ggt ttc cac acc tgg gtc aac ggt 2447
Leu Leu Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly
800 805 810 815

ttg ggt caa cat gct tcc ctc atc ttg gct gca ctc aag gat ggt gct 2495
Leu Gly Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala
820 825 830

ctc acc gtc acc gat gtg gct caa gcc atg aac aag gag gag tcc ctc 2543
Leu Thr Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu
835 840 845

ttg caa atg gct gcc aac cag gtg gag aag gac ctc acc aag ctc acc 2591
Leu Gln Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr
850 855 860

tcc tgg acc caa atc gat gcc atc ctc caa tgg ctc caa atg tcc tct 2639
Ser Trp Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser
865 870 875

gct ctt gct gtc agc cca ttg gac ctt gct ggc atg atg gct ctc aag 2687
Ala Leu Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys
880 885 890 895

tac ggc att gat cac aac tac gct gcc tgg caa gca gct gcc gct gcc 2735
Tyr Gly Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala
900 905 910

ctc atg gct gac cat gcc aac cag gct cag aag aag ttg gat gag acc 2783
Leu Met Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr
915 920 925

ttc tcc aag gct ctc tgc aac tac tac atc aac gcc gtg gtt gac tct 2831
Phe Ser Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser
930 935 940

gct gcc ggt gtc agg gac agg aac ggt ctc tac acc tac ctc ttg att 2879
Ala Ala Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile
945 950 955

gac aac cag gtc tct gct gat gtc atc acc tcc aga att gct gag gcc 2927
Asp Asn Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala
960 965 970 975

att gct ggc atc caa ctc tac gtc aac agg gct ctc aac agg gat gag 2975
Ile Ala Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu
980 985 990

ggt cag ttg gct tct gat gtc tcc acc agg caa ttc ttc acc gac tgg 3023
Gly Gln Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp
995 1000 1005

gag agg tac aac aag agg tac tcc acc tgg gct ggt gtc tct gag ttg 3071
Glu Arg Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu
1010 1015 1020

gtc tac tac cca gag aac tac gtg gac cca acc caa agg att ggt cag 3119
Val Tyr Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gln Arg Ile Gly Gln
1025 1030 1035

acc aag atg atg gat gct ttg ctc caa tcc atc aac cag tcc caa ctc 3167
Thr Lys Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu
1040 1045 1050 1055

aac gct gac act gtg gag gat gct ttc aag acc tac ctc acc tcc ttc 3215
Asn Ala Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe
1060 1065 1070

gag caa gtg gcc aac ctc aag gtc atc tct gct tac cat gac aac gtc 3263
Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val
1075 1080 1085

aac gtg gac caa ggt ctc acc tac ttc att ggc att gac caa gcc gct 3311
Asn Val Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala
1090 1095 1100

cct ggc acc tac tac tgg agg tct gtg gac cac tcc aag tgc gag aac 3359
Pro Gly Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn
1105 1110 1115

ggc aag ttc gct gcc aac gct tgg ggt gag tgg aac aag atc acc tgc 3407
Gly Lys Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys
1120 1125 1130 1135

gct gtc aac cct tgg aag aac atc atc agg cca gtg gtc tac atg tcc 3455
Ala Val Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser
1140 1145 1150

aga ctc tac ttg ctc tgg ctt gag caa cag tcc aag aag tct gat gac 3503
Arg Leu Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp
1155 1160 1165

ggc aag aca act atc tac cag tac aac ctc aag ttg gct cac atc cgc 3551
Gly Lys Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg
1170 1175 1180

tac gat ggt tcc tgg aac act cca ttc acc ttc gat gtc act gag aag 3599
Tyr Asp Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys
1185 1190 1195

gtc aag aac tac acc tcc agc act gat gca gct gag tcc ctt ggt ctc 3647
Val Lys Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu
1200 1205 1210 1215

tac tgc act ggt tac caa ggt gag gac acc ctc ttg gtc atg ttc tac 3695
Tyr Cys Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr
1220 1225 1230

tcc atg caa tcc agc tac tcc agc tac act gac aac aac gct cca gtc 3743
Ser Met Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val
1235 1240 1245

act ggt ctc tac atc ttc gct gac atg tcc tct gac aac atg acc aac 3791
Thr Gly Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn
1250 1255 1260

gct caa gcc acc aac tac tgg aac aac tcc tac cca caa ttc gac act 3839
Ala Gln Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr
1265 1270 1275

gtc atg gct gac cca gac tct gac aac aag aag gtc atc acc agg cgt 3887
Val Met Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg
1280 1285 1290 1295

gtc aac aac cgc tac gct gag gac tac gag atc cca agc tct gtc acc 3935
Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr
1300 1305 1310

tcc aac agc aac tac tcc tgg ggt gac cac tcc ctc acc atg ctc tac 3983
Ser Asn Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr
1315 1320 1325

ggt ggc tct gtc cca aac atc acc ttc gag tct gca gct gag gac ctc 4031
Gly Gly Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu
1330 1335 1340

agg ctc tcc acc aac atg gct ctc tcc atc att cac aac ggt tac gct 4079
Arg Leu Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala
1345 1350 1355

ggc acc agg cgc atc caa tgc aac ctc atg aag caa tac gct tcc ctt 4127
Gly Thr Arg Arg Ile Gln Cys Asn Leu Met Lys Gln Tyr Ala Ser Leu
1360 1365 1370 1375

ggt gac aag ttc att atc tac gac tcc agc ttc gat gac gcc aac agg 4175
Gly Asp Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg
1380 1385 1390

ttc aac ttg gtc cca ctc ttc aag ttc ggc aag gat gag aac tct gat 4223
Phe Asn Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp
1395 1400 1405

gac tcc atc tgc atc tac aac gag aac cca agc tct gag gac aag aag 4271
Asp Ser Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys
1410 1415 1420

tgg tac ttc agc tcc aag gac gac aac aag act gct gac tac aac ggt 4319
Trp Tyr Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly
1425 1430 1435

ggc acc caa tgc att gat gct ggc acc tcc aac aag gac ttc tac tac 4367
Gly Thr Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr
1440 1445 1450 1455

aac ctc caa gag att gag gtc atc tct gtc act ggt ggc tac tgg tcc 4415
Asn Leu Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser
1460 1465 1470

agc tac aag atc agc aac ccc atc aac atc aac act ggc att gac tct 4463
Ser Tyr Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser
1475 1480 1485

gcc aag gtc aag gtc act gtc aag gct ggt ggc gat gac caa atc ttc 4511
Ala Lys Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe
1490 1495 1500

act gct gac aac tcc acc tac gtc cca cag caa cct gct cca tcc ttc 4559
Thr Ala Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe
1505 1510 1515

gag gag atg ate tac caa ttc aac aac etc acc att gac tgc aag aac 4607 
Glu Glu Met He Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn 
1520 1525 1530 1535 

etc aac ttc att gac aac cag get cac att gag att gac ttc act gee 4655 
Leu Asn Phe He Asp Asn Gin Ala His He Glu He Asp Phe Thr Ala 
1540 1545 1550 

aca get caa gat ggc cgc ttc ttg ggt get gag acc ttc ate att cca 4703 
Thr Ala Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe He He Pro 
1555 1560 1565 

gtc acc aag aag gtc ctt ggc act gag aac gtc att get etc tac tct 4751 
Val Thr Lys Lys Val Leu Gly Thr Glu Asn Val He Ala Leu Tyr Ser 
1570 1575 1580 

gag aac aac ggt gtc cag tac atg caa att ggt get tac aga acc agg 4799 
Glu Asn Asn Gly Val Gin Tyr Met Gin He Gly Ala Tyr Arg Thr Arg 
1585 1590 1595 

etc aac acc etc ttc get caa cag ttg gtc tec cgt gee aac aga ggc 4847 
Leu Asn Thr Leu Phe Ala Gin Gin Leu Val Ser Arg Ala Asn Arg Gly 
1600 1605 1610 1615 

att gat get gtc etc age atg gag act cag aac ate caa gag cca caa 4895 
He Asp Ala Val Leu Ser Met Glu Thr Gin Asn He Gin Glu Pro Gin 
1620 1625 1630 

ctt ggt get ggc acc tac gtc caa ctt gtc ttg gac aag tac gat gag 4943 
Leu Gly Ala Gly Thr Tyr Val Gin Leu Val Leu Asp Lys Tyr Asp Glu 
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1635 



1640 



1645 



tcc att cat ggc acc aac aag tcc ttc gcc att gag tac gtg gac atc 4991
Ser Ile His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile
1650 1655 1660

ttc aag gag aac gac tcc ttc gtc atc tac caa ggt gag ttg tct gag 5039
Phe Lys Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu
1665 1670 1675

acc tcc caa act gtg gtc aag gtc ttc ctc tcc tac ttc att gag gcc 5087
Thr Ser Gln Thr Val Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala
1680 1685 1690 1695

acc ggt aac aag aac cac ctc tgg gtc agg gcc aag tac cag aag gag 5135
Thr Gly Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu
1700 1705 1710

acc act gac aag atc ctc ttc gac agg act gat gag aag gac cca cat 5183
Thr Thr Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His
1715 1720 1725

ggt tgg ttc ctc tct gat gac cac aag acc ttc tct ggt ctc agc tct 5231
Gly Trp Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser
1730 1735 1740

gct caa gct ctc aag aac gac tct gag cca atg gac ttc tct ggt gcc 5279
Ala Gln Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala
1745 1750 1755

aac gct ctc tac ttc tgg gag ttg ttc tac tac act cca atg atg atg 5327
Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met
1760 1765 1770 1775

gct cac agg ctc ctt caa gag cag aac ttc gat gct gcc aac cac tgg 5375
Ala His Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn His Trp
1780 1785 1790

ttc cgc tac gtc tgg agc cca tct ggt tac att gtg gat ggc aag att 5423
Phe Arg Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile
1795 1800 1805

gcc atc tac cac tgg aac gtc agg cca ttg gag gag gac acc tcc tgg 5471
Ala Ile Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp
1810 1815 1820

aac gct cag caa ctt gac tcc act gac cca gat gct gtg gct caa gat 5519
Asn Ala Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp
1825 1830 1835



gac cca atg cac tac aag gtg gcc acc ttc atg gcc acc ttg gac ctt 5567
Asp Pro Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu
1840 1845 1850 1855

ctc atg gcc aga ggt gat gct gcc tac cgc caa ttg gag agg gac acc 5615
Leu Met Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr
1860 1865 1870

ttg gct gag gcc aag atg tgg tac acc caa gct ctc aac ttg ctg ggt 5663
Leu Ala Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly
1875 1880 1885

gat gag cca caa gtc atg ctc tcc aca acc tgg gcc aac cca acc ttg 5711
Asp Glu Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu
1890 1895 1900

ggc aac gct gcc tcc aag acc aca caa cag gtc agg caa cag gtc ctc 5759
Gly Asn Ala Ala Ser Lys Thr Thr Gln Gln Val Arg Gln Gln Val Leu
1905 1910 1915

acc caa ctc agg ctc aac tct aga gtc aag act cca ctc ttg ggc act 5807
Thr Gln Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr
1920 1925 1930 1935

gcc aac tcc ctc act gct ctc ttc ctc cca caa gag aac tcc aaa ctt 5855
Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn Ser Lys Leu
1940 1945 1950

aag ggt tac tgg agg acc ctt gct caa cgc atg ttc aac ctc agg cac 5903
Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His
1955 1960 1965

aac ctc tcc att gat ggt caa cca ctc tcc ttg cca ctc tac gct aag 5951
Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys
1970 1975 1980

cca gct gac cca aag gct ctc ctt tcc gct gct gtc tcc gca tcc caa 5999
Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln
1985 1990 1995

ggt ggt gct gac ctc cca aag gct cca ctc acc atc cac agg ttc cca 6047
Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro
2000 2005 2010 2015

caa atg ttg gag ggt gcc cgt ggt ctt gtc aac cag ctc atc caa ttc 6095
Gln Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe
2020 2025 2030

ggt tcc tct ctc ctt ggt tac tct gag agg caa gat gct gag gcc atg 6143
Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met
2035 2040 2045

tcc caa ctc ttg caa acc cag gct tct gag ttg atc ctc acc tcc atc 6191
Ser Gln Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile
2050 2055 2060

agg atg caa gac aac cag ctt gct gag ttg gac tct gag aag act gct 6239
Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala
2065 2070 2075

ctc caa gtc tcc ctt gct ggt gtc caa cag agg ttc gac agc tac tcc 6287
Leu Gln Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp Ser Tyr Ser
2080 2085 2090 2095

caa ctc tac gag gag aac atc aac gct ggt gag caa agg gct ttg gct 6335
Gln Leu Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala
2100 2105 2110

ctc agg tct gag tct gcc att gag tcc caa ggt gct caa atc tcc cgc 6383
Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg
2115 2120 2125

atg gct ggt gct ggc gtg gac atg gct cca aac atc ttc ggt ctt gct 6431
Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala
2130 2135 2140

gat ggt ggc atg cac tac ggt gcc att gct tac gcc att gct gat ggc 6479
Asp Gly Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly
2145 2150 2155

att gag ctt tct gct tct gcc aag atg gtt gat gct gag aag gtg gct 6527
Ile Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala
2160 2165 2170 2175

caa tct gaa atc tac cgt cgc aga cgc caa gaa tgg aag atc caa agg 6575
Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg
2180 2185 2190

gac aac gct caa gct gag atc aac cag ctc aac gct caa ctt gag tcc 6623
Asp Asn Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser
2195 2200 2205

ctc agc atc agg cgt gag gct gct gag atg cag aag gag tac ctc aag 6671
Leu Ser Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys
2210 2215 2220

acc caa cag gct caa gct cag gct caa ctc acc ttc ctc agg tcc aag 6719
Thr Gln Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys
2225 2230 2235

ttc tcc aac cag gct ctc tac tcc tgg ctc aga ggc cgc ctc tct ggc 6767
Phe Ser Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly
2240 2245 2250 2255



atc tac ttc caa ttc tac gac ttg gct gtc tcc cgc tgc ctc atg gct 6815
Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala
2260 2265 2270

gag caa tcc tac caa tgg gag gcc aac gac aac agc atc tcc ttc gtc 6863
Glu Gln Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val
2275 2280 2285

aag cca ggt gct tgg caa ggc acc tac gct ggt ctc ctt tgc ggt gag 6911
Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu
2290 2295 2300

gct ctc atc cag aac ttg gct caa atg gag gag gct tac ctc aag tgg 6959
Ala Leu Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp
2305 2310 2315

gag tcc aga gct ttg gag gta gag agg act gtc tcc ctt gct gta gtc 7007
Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val
2320 2325 2330 2335

tac gac tcc ttg gag ggc aac gac agg ttc aac ctt gct gag caa atc 7055
Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile
2340 2345 2350

cca gct ctc ttg gac aag ggt gag ggc act gct ggc acc aag gag aac 7103
Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn
2355 2360 2365

ggt ctc tcc ttg gcc aac gcc atc ctc tct gct tct gtc aag ctc tct 7151
Gly Leu Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser
2370 2375 2380

gac ctc aag ttg ggt act gac tac cca gac tcc att gtg ggt tcc aac 7199
Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn
2385 2390 2395

aag gtc aga agg atc aag caa atc tct gtc tcc ctc cca gct ttg gtg 7247
Lys Val Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val
2400 2405 2410 2415

ggt cca tac caa gat gtc caa gcc atg ctc tcc tac ggt ggc tcc acc 7295
Gly Pro Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr
2420 2425 2430

caa ctc cca aag ggt tgc tct gct ttg gct gtc tcc cac ggc acc aac 7343
Gln Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn
2435 2440 2445

gac tct ggt caa ttc caa ctt gac ttc aac gat ggc aag tac ctc cca 7391
Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro
2450 2455 2460

ttc gaa ggc att gct ttg gat gac caa ggc acc ctc aac ctc caa ttc 7439
Phe Glu Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe
2465 2470 2475

cca aac gcc act gac aag cag aag gcc atc ctc caa acc atg tct gac 7487
Pro Asn Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp
2480 2485 2490 2495

atc atc ctc cac atc agg tac acc atc agg tgagctcgag aggcctgcgg 7537
Ile Ile Leu His Ile Arg Tyr Thr Ile Arg
2500 2505

ccgc 7541

<210> 5 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: hemicot sequence 
encoding ER signal from 15 kDa zein from Black 
Mexican Sweet maize 

<220> 
<221> CDS 
<222> (1)..(63)

<400> 5 

atg gct aag atg gtc att gtg ctt gtg gtc tgc ttg gct ctc tct gct 48
Met Ala Lys Met Val Ile Val Leu Val Val Cys Leu Ala Leu Ser Ala
1 5 10 15

gcc tgt gct tca gcc 63
Ala Cys Ala Ser Ala
20

<210> 6
<211> 7621
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: hemicot tcdA
fused to the modified 15 kDa zein endoplasmic
reticulum signal peptide

<220>
<221> CDS
<222> (4)..(7614)

<400> 6
ncc atg gct aag atg gtc att gtg ctt gtg gtc tgc ttg gct ctc tct 48
Met Ala Lys Met Val Ile Val Leu Val Val Cys Leu Ala Leu Ser
1 5 10 15

gct gcc tgt gct tca gcc atg aac gag tcc gtc aag gag atc cca gac 96
Ala Ala Cys Ala Ser Ala Met Asn Glu Ser Val Lys Glu Ile Pro Asp
20 25 30

gtc ctc aag tcc caa tgc ggt ttc aac tgc ctc act gac atc tcc cac 144
Val Leu Lys Ser Gln Cys Gly Phe Asn Cys Leu Thr Asp Ile Ser His
35 40 45

agc tcc ttc aac gag ttc aga caa caa gtc tct gag cac ctc tcc tgg 192
Ser Ser Phe Asn Glu Phe Arg Gln Gln Val Ser Glu His Leu Ser Trp
50 55 60

tcc gag acc cat gac ctc tac cat gac gct cag caa gct cag aag gac 240
Ser Glu Thr His Asp Leu Tyr His Asp Ala Gln Gln Ala Gln Lys Asp
65 70 75

aac agg ctc tac gag gct agg atc ctc aag agg gct aac cca caa ctc 288
Asn Arg Leu Tyr Glu Ala Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu
80 85 90 95

cag aac gct gtc cac ctc gcc atc ttg gct cca aac gct gag ttg att 336
Gln Asn Ala Val His Leu Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile
100 105 110

ggt tac aac aac cag ttc tct ggc aga gct agc cag tac gtg gct cct 384
Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro
115 120 125

ggt aca gtc tcc tcc atg ttc agc cca gcc gct tac ctc act gag ttg 432
Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu
130 135 140

tac cgc gag gct agg aac ctt cat gct tct gac tcc gtc tac tac ttg 480
Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr Tyr Leu
145 150 155

gac aca cgc aga cca gac ctc aag agc atg gcc ctc agc caa cag aac 528
Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln Gln Asn
160 165 170 175

atg gac att gag ttg tcc acc ctc tcc ttg agc aac gag ctt ctc ttg 576
Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu
180 185 190

gag tcc atc aag act gag agc aag ttg gag aac tac acc aag gtc atg 624
Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys Val Met
195 200 205

gag atg ctc tcc acc ttc aga cca agc ggt gca act cca tac cat gat 672
Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr His Asp
210 215 220

gcc tac gag aac gtc agg gag gtc atc caa ctt caa gac cct ggt ctt 720
Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro Gly Leu
225 230 235

gag caa ctc aac gct tct cca gcc att gct ggt ttg atg cac cag gca 768
Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His Gln Ala
240 245 250 255

tcc ttg ctc ggt atc aac gcc tcc atc tct cct gag ttg ttc aac atc 816
Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile
260 265 270

ttg act gag gag atc act gag ggc aac gct gag gag ttg tac aag aag 864
Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys
275 280 285

aac ttc ggc aac att gag cca gcc tct ctt gca atg cct gag tac ctc 912
Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu
290 295 300

aag agg tac tac aac ttg tct gat gag gag ctt tct caa ttc att ggc 960
Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly
305 310 315

aag gct tcc aac ttc ggt caa cag gag tac agc aac aac cag ctc atc 1008
Lys Ala Ser Asn Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile
320 325 330 335

act cca gtt gtg aac tcc tct gat ggc act gtg aag gtc tac cgc atc 1056
Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr Arg Ile
340 345 350

aca cgt gag tac acc aca aac gcc tac caa atg gat gtt gag ttg ttc 1104
Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu Leu Phe
355 360 365

cca ttc ggt ggt gag aac tac aga ctt gac tac aag ttc aag aac ttc 1152
Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe
370 375 380

tac aac gcc tcc tac ctc tcc atc aag ttg aac gac aag agg gag ctt 1200
Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu
385 390 395

gtc agg act gag ggt gct cct caa gtg aac att gag tac tct gcc aac 1248
Val Arg Thr Glu Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn
400 405 410 415

atc acc ctc aac aca gct gac atc tct caa cca ttc gag att ggt ttg 1296
Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu
420 425 430

acc aga gtc ctt ccc tct ggc tcc tgg gcc tac gct gca gcc aag ttc 1344
Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe
435 440 445

act gtt gag gag tac aac cag tac tct ttc ctc ttg aag ctc aac aag 1392
Thr Val Glu Glu Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys
450 455 460

gca att cgt ctc agc aga gcc act gag ttg tct ccc acc atc ttg gag 1440
Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu
465 470 475

ggc att gtg agg tct gtc aac ctt caa ctt gac atc aac act gat gtg 1488
Gly Ile Val Arg Ser Val Asn Leu Gln Leu Asp Ile Asn Thr Asp Val
480 485 490 495

ctt ggc aag gtc ttc ctc acc aag tac tac atg caa cgc tac gcc atc 1536
Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile
500 505 510

cat gct gag act gca ctc atc ctc tgc aac gca ccc atc tct caa cgc 1584
His Ala Glu Thr Ala Leu Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg
515 520 525

tcc tac gac aac cag cct tcc cag ttc gac agg ctc ttc aac act cct 1632
Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro
530 535 540

ctc ttg aac ggc cag tac ttc tcc act ggt gat gag gag att gac ctc 1680
Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu
545 550 555

aac tct ggc tcc aca ggt gac tgg aga aag acc atc ttg aag agg gcc 1728
Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala
560 565 570 575

ttc aac att gat gat gtc tct ctc ttc cgt ctc ttg aag atc aca gat 1776
Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp
580 585 590

cac gac aac aag gat ggc aag atc aag aac aac ttg aag aac ctt tcc 1824
His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser
595 600 605

aac ctc tac att ggc aag ttg ctt gca gac atc cac caa ctc acc att 1872
Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu Thr Ile
610 615 620

gat gag ttg gac ctc ttg ctc att gca gtc ggt gag ggc aag acc aac 1920
Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys Thr Asn
625 630 635

ctc tct gca atc tct gac aag cag ttg gca acc ctc atc agg aag ttg 1968
Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg Lys Leu
640 645 650 655

aac acc atc acc tcc tgg ctt cac acc cag aag tgg tct gtc ttc caa 2016
Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val Phe Gln
660 665 670

ctc ttc atc atg acc agc acc tcc tac aac aag acc ctc act cct gag 2064
Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu
675 680 685

atc aag aac ctc ttg gac aca gtc tac cac ggt ctc caa ggc ttc gac 2112
Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly Phe Asp
690 695 700

aag gac aag gct gac ttg ctt cat gtc atg gct ccc tac att gca gcc 2160
Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile Ala Ala
705 710 715

acc ctc caa ctc tcc tct gag aac gtg gct cac tct gtc ttg ctc tgg 2208
Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu Leu Trp
720 725 730 735

gct gac aag ctc caa cct ggt gat ggt gcc atg act gct gag aag ttc 2256
Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu Lys Phe
740 745 750

tgg gac tgg ctc aac acc aag tac aca cca ggc tcc tct gag gct gtt 2304
Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val
755 760 765

gag act caa gag cac att gtg caa tac tgc cag gct ctt gca cag ttg 2352
Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu
770 775 780

gag atg gtc tac cac tcc act ggc atc aac gag aac gct ttc aga ctc 2400
Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu
785 790 795

ttc gtc acc aag cct gag atg ttc ggt gct gcc aca ggt gct gca cct 2448
Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala Pro
800 805 810 815

gct cat gat gct ctc tcc ctc atc atg ttg acc agg ttc gct gac tgg 2496
Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp Trp
820 825 830

gtc aac gct ctt ggt gag aag gct tcc tct gtc ttg gct gcc ttc gag 2544
Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe Glu
835 840 845

gcc aac tcc ctc act gct gag caa ctt gct gat gcc atg aac ctt gat 2592
Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn Leu Asp
850 855 860

gcc aac ctc ttg ctc caa gct tcc att caa gct cag aac cac caa cac 2640
Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His Gln His
865 870 875

ctc cca cct gtc act cca gag aac gct ttc tcc tgc tgg acc tcc atc 2688
Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile
880 885 890 895

aac acc atc ctc caa tgg gtc aac gtg gct cag caa ctc aac gtg gct 2736
Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Asn Val Ala
900 905 910

cca caa ggt gtc tct gct ttg gtc ggt ctt gac tac atc cag tcc atg 2784
Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln Ser Met
915 920 925

aag gag aca cca acc tac gct caa tgg gag aac gca gct ggt gtc ttg 2832
Lys Glu Thr Pro Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly Val Leu
930 935 940

act gct ggt ctc aac tcc caa cag gcc aac acc ctc cat gct ttc ttg 2880
Thr Ala Gly Leu Asn Ser Gln Gln Ala Asn Thr Leu His Ala Phe Leu
945 950 955

gat gag tct cgc tct gct gcc ctc tcc acc tac tac atc agg caa gtc 2928
Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val
960 965 970 975

gcc aag gca gct gct gcc atc aag tct cgc gat gac ctc tac caa tac 2976
Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr
980 985 990

ctc ctc att gac aac cag gtc tct gct gcc atc aag acc acc agg atc 3024
Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr Arg Ile
995 1000 1005

gct gag gcc atc gct tcc atc caa ctc tac gtc aac cgc gct ctt gag 3072
Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu
1010 1015 1020

aac gtt gag gag aac gcc aac tct ggt gtc atc tct cgc caa ttc ttc 3120
Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln Phe Phe
1025 1030 1035

atc gac tgg gac aag tac aac aag agg tac tcc acc tgg gct ggt gtc 3168
Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val
1040 1045 1050 1055

tct caa ctt gtc tac tac cca gag aac tac att gac cca acc atg agg 3216
Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg
1060 1065 1070

att ggt cag acc aag atg atg gat gct ctc ttg caa tct gtc tcc caa 3264
Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val Ser Gln
1075 1080 1085

agc caa ctc aac gct gac act gtg gag gat gcc ttc atg agc tac ctc 3312
Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser Tyr Leu
1090 1095 1100

acc tcc ttc gag caa gtt gcc aac ctc aag gtc atc tct gct tac cat 3360
Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His
1105 1110 1115

gac aac atc aac aac gac caa ggt ctc acc tac ttc att ggt ctc tct 3408
Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser
1120 1125 1130 1135

gag act gat gct ggt gag tac tac tgg aga tcc gtg gac cac agc aag 3456
Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His Ser Lys
1140 1145 1150

ttc aac gat ggc aag ttc gct gca aac gct tgg tct gag tgg cac aag 3504
Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp His Lys
1155 1160 1165

att gac tgc cct atc aac cca tac aag tcc acc atc aga cct gtc atc 3552
Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile
1170 1175 1180

tac aag agc cgc ctc tac ttg ctc tgg ctt gag cag aag gag atc acc 3600
Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr
1185 1190 1195

aag caa act ggc aac tcc aag gat ggt tac caa act gag act gac tac 3648
Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr
1200 1205 1210 1215

cgc tac gag ttg aag ttg gct cac atc cgc tac gat ggt acc tgg aac 3696
Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr Trp Asn
1220 1225 1230

act cca atc acc ttc gat gtc aac aag aag atc agc gag ttg aag ttg 3744
Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu Lys Leu
1235 1240 1245

gag aag aac cgt gct cct ggt ctc tac tgc gct ggt tac caa ggt gag 3792
Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu
1250 1255 1260

gac acc ctc ttg gtc atg ttc tac aac cag caa gac acc ctt gac tcc 3840
Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser
1265 1270 1275

tac aag aac gct tcc atg caa ggt ctc tac atc ttc gct gac atg gct 3888
Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala
1280 1285 1290 1295

tcc aag gac atg act cca gag caa agc aac gtc tac cgt gac aac tcc 3936
Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser
1300 1305 1310

tac caa cag ttc gac acc aac aac gtc agg cgt gtc aac aac aga tac 3984
Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn Arg Tyr
1315 1320 1325

gct gag gac tac gag atc cca agc tct gtc agc tct cgc aag gac tac 4032
Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr
1330 1335 1340

ggc tgg ggt gac tac tac ctc agc atg gtg tac aac ggt gac atc cca 4080
Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp Ile Pro
1345 1350 1355

acc atc aac tac aag gct gcc tct tcc gac ctc aaa atc tac atc agc 4128
Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser
1360 1365 1370 1375

cca aag ctc agg atc atc cac aac ggc tac gag ggt cag aag agg aac 4176
Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys Arg Asn
1380 1385 1390

cag tgc aac ttg atg aac aag tac ggc aag ttg ggt gac aag ttc att 4224
Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile
1395 1400 1405

gtc tac acc tct ctt ggt gtc aac cca aac aac agc tcc aac aag ctc 4272
Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn Lys Leu
1410 1415 1420

atg ttc tac cca gtc tac caa tac tct ggc aac acc tct ggt ctc aac 4320
Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn
1425 1430 1435

cag ggt aga ctc ttg ttc cac agg gac acc acc tac cca agc aag gtg 4368
Gln Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser Lys Val
1440 1445 1450 1455

gag gct tgg att cct ggt gcc aag agg tcc ctc acc aac cag aac gct 4416
Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala
1460 1465 1470

gcc att ggt gat gac tac gcc aca gac tcc ctc aac aag cct gat gac 4464
Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp
1475 1480 1485

ctc aag cag tac atc ttc atg act gac tcc aag ggc aca gcc act gat 4512
Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala Thr Asp
1490 1495 1500

gtc tct ggt cca gtg gag atc aac act gca atc agc cca gcc aag gtc 4560
Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala Lys Val
1505 1510 1515

caa atc att gtc aag gct ggt ggc aag gag caa acc ttc aca gct gac 4608
Gln Ile Ile Val Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp
1520 1525 1530 1535

aag gat gtc tcc atc cag cca agc cca tcc ttc gat gag atg aac tac 4656
Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr
1540 1545 1550

caa ttc aac gct ctt gag att gat ggt tct ggc ctc aac ttc atc aac 4704
Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn
1555 1560 1565

aac tct gct tcc att gat gtc acc ttc act gcc ttc gct gag gat ggc 4752
Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu Asp Gly
1570 1575 1580

cgc aag ttg ggt tac gag agc ttc tcc atc cca gtc acc ctt aag gtt 4800
Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu Lys Val
1585 1590 1595

tcc act gac aac gca ctc acc ctt cat cac aac gag aac ggt gct cag 4848
Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly Ala Gln
1600 1605 1610 1615

tac atg caa tgg caa agc tac cgc acc agg ttg aac acc ctc ttc gca 4896
Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala
1620 1625 1630

agg caa ctt gtg gcc cgt gcc acc aca ggc att gac acc atc ctc agc 4944
Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser
1635 1640 1645

atg gag acc cag aac atc caa gag cca cag ttg ggc aag ggt ttc tac 4992
Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr
1650 1655 1660

gcc acc ttc gtc atc cca cct tac aac ctc agc act cat ggt gat gag 5040
Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly Asp Glu
1665 1670 1675

agg tgg ttc aag ctc tac atc aag cac gtg gtt gac aac aac tcc cac 5088
Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn Ser His
1680 1685 1690 1695

atc atc tac tct ggt caa ctc act gac acc aac atc aac atc acc ctc 5136
Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu
1700 1705 1710

ttc atc cca ctt gac gat gtc cca ctc aac cag gac tac cat gcc aag 5184
Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His Ala Lys
1715 1720 1725

gtc tac atg acc ttc aag aag tct cca tct gat ggc acc tgg tgg ggt 5232
Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly
1730 1735 1740

cca cac ttc gtc cgt gat gac aag ggc atc gtc acc atc aac cca aag 5280
Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn Pro Lys
1745 1750 1755

tcc atc ctc acc cac ttc gag tct gtc aac gtt ctc aac aac atc tcc 5328
Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn Ile Ser
1760 1765 1770 1775

tct gag cca atg gac ttc tct ggt gcc aac tcc ctc tac ttc tgg gag 5376
Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu
1780 1785 1790

ttg ttc tac tac aca cca atg ctt gtg gct caa agg ttg ctc cat gag 5424
Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu His Glu
1795 1800 1805

cag aac ttc gat gag gcc aac agg tgg ctc aag tac gtc tgg agc cca 5472
Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro
1810 1815 1820

tct ggt tac att gtg cat ggt caa atc cag aac tac caa tgg aac gtc 5520
Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val
1825 1830 1835

agg cca ttg ctt gag gac acc tcc tgg aac tct gac cca ctt gac tct 5568
Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser
1840 1845 1850 1855

gtg gac cct gat gct gtg gct caa cat gac cca atg cac tac aag gtc 5616
Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr Lys Val
1860 1865 1870

tcc acc ttc atg agg acc ttg gac ctc ttg att gcc aga ggt gac cat 5664
Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp His
1875 1880 1885

gct tac cgc caa ttg gag agg gac acc ctc aac gag gca aag atg tgg 5712
Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys Met Trp
1890 1895 1900

tac atg caa gct ctc cac ctc ttg ggt gac aag cca tac ctc cca ctc 5760
Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu
1905 1910 1915

agc acc act tgg tcc gac cca agg ttg gac cgt gct gct gac atc acc 5808
Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr
1920 1925 1930 1935

act cag aac gct cat gac tct gcc att gtt gct ctc agg cag aac atc 5856
Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln Asn Ile
1940 1945 1950

cca act cct gct cca ctc tcc ctc aga tct gct aac acc ctc act gac 5904
Pro Thr Pro Ala Pro Leu Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp
1955 1960 1965

ttg ttc ctc cca cag atc aac gag gtc atg atg aac tac tgg caa acc 5952
Leu Phe Leu Pro Gln Ile Asn Glu Val Met Met Asn Tyr Trp Gln Thr
1970 1975 1980

ttg gct caa agg gtc tac aac ctc aga cac aac ctc tcc att gat ggt 6000
Leu Ala Gln Arg Val Tyr Asn Leu Arg His Asn Leu Ser Ile Asp Gly
1985 1990 1995

caa cca ctc tac ctc cca atc tac gcc aca cca gct gac cca aag gct 6048
Gln Pro Leu Tyr Leu Pro Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala
2000 2005 2010 2015

ctt ctc tct gct gct gtg gct acc agc caa ggt ggt ggc aag ctc cca 6096
Leu Leu Ser Ala Ala Val Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro
2020 2025 2030

gag tcc ttc atg tcc ctc tgg agg ttc cca cac atg ttg gag aac gcc 6144
Glu Ser Phe Met Ser Leu Trp Arg Phe Pro His Met Leu Glu Asn Ala
2035 2040 2045

cgt ggc atg gtc tcc caa ctc acc cag ttc ggt tcc acc ctc cag aac 6192
Arg Gly Met Val Ser Gln Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn
2050 2055 2060

atc att gag agg caa gat gct gag gct ctc aac gct ttg ctc cag aac 6240
Ile Ile Glu Arg Gln Asp Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn
2065 2070 2075

cag gca gct gag ttg atc ctc acc aac ttg tcc atc caa gac aag acc 6288
Gln Ala Ala Glu Leu Ile Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr
2080 2085 2090 2095

att gag gag ctt gat gct gag aag aca gtc ctt gag aag agc aag gct 6336
Ile Glu Glu Leu Asp Ala Glu Lys Thr Val Leu Glu Lys Ser Lys Ala
2100 2105 2110

ggt gcc caa tct cgc ttc gac tcc tac ggc aag ctc tac gat gag aac 6384
Gly Ala Gln Ser Arg Phe Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn
2115 2120 2125

atc aac gct ggt gag aac cag gcc atg acc ctc agg gct tcc gca gct 6432
Ile Asn Ala Gly Glu Asn Gln Ala Met Thr Leu Arg Ala Ser Ala Ala
2130 2135 2140

ggt ctc acc act gct gtc caa gcc tct cgc ttg gct ggt gca gct gct 6480
Gly Leu Thr Thr Ala Val Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala
2145 2150 2155

gac ctc gtt cca aac atc ttc ggt ttc gct ggt ggt ggc tcc aga tgg 6528
Asp Leu Val Pro Asn Ile Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp
2160 2165 2170 2175

ggt gcc att gct gag gct acc ggt tac gtc atg gag ttc tct gcc aac 6576
Gly Ala Ile Ala Glu Ala Thr Gly Tyr Val Met Glu Phe Ser Ala Asn
2180 2185 2190

gtc atg aac act gag gct gac aag atc agc caa tct gag acc tac aga 6624
Val Met Asn Thr Glu Ala Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg
2195 2200 2205

agg cgc cgt caa gag tgg gag atc caa agg aac aac gct gag gca gag 6672
Arg Arg Arg Gln Glu Trp Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu
2210 2215 2220

ttg aag caa atc gat gct caa ctc aag tcc ttg gct gtc aga agg gag 6720
Leu Lys Gln Ile Asp Ala Gln Leu Lys Ser Leu Ala Val Arg Arg Glu
2225 2230 2235

gct gct gtc ctc cag aag acc tcc ctc aag acc caa cag gag caa acc 6768
Ala Ala Val Leu Gln Lys Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr
2240 2245 2250 2255

cag tcc cag ttg gct ttc ctc caa agg aag ttc tcc aac cag gct ctc 6816
Gln Ser Gln Leu Ala Phe Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu
2260 2265 2270

tac aac tgg ctc aga ggc cgc ttg gct gcc atc tac ttc caa ttc tac 6864
Tyr Asn Trp Leu Arg Gly Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr
2275 2280 2285

gac ctt gct gtg gcc agg tgc ctc atg gct gag caa gcc tac cgc tgg 6912
Asp Leu Ala Val Ala Arg Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp
2290 2295 2300

gag ttg aac gat gac tcc gcc agg ttc atc aag cca ggt gct tgg caa 6960
Glu Leu Asn Asp Asp Ser Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln
2305 2310 2315

ggc acc tac gct ggt ctc ctt gct ggt gag acc ctc atg ctc tcc ttg 7008
Gly Thr Tyr Ala Gly Leu Leu Ala Gly Glu Thr Leu Met Leu Ser Leu
2320 2325 2330 2335

gct caa atg gag gat gct cac ctc aag agg gac aag agg gct ttg gag 7056
Ala Gln Met Glu Asp Ala His Leu Lys Arg Asp Lys Arg Ala Leu Glu
2340 2345 2350

gtg gag agg aca gtc tcc ctt gct gag gtc tac gct ggt ctc cca aag 7104
Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys
2355 2360 2365

gac aac ggt cca ttc tcc ctt gct caa gag att gac aag ttg gtc agc 7152
Asp Asn Gly Pro Phe Ser Leu Ala Gln Glu Ile Asp Lys Leu Val Ser
2370 2375 2380

caa ggt tct ggt tct gct ggt tct ggt aac aac aac ttg gct ttc ggc 7200
Gln Gly Ser Gly Ser Ala Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly
2385 2390 2395

gct ggt act gac acc aag acc tcc ctc caa gcc tct gtc tcc ttc gct 7248
Ala Gly Thr Asp Thr Lys Thr Ser Leu Gln Ala Ser Val Ser Phe Ala
2400 2405 2410 2415

gac ctc aag atc agg gag gac tac cca gct tcc ctt ggc aag atc agg 7296
Asp Leu Lys Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg
2420 2425 2430

cgc atc aag caa atc tct gtc acc ctc cca gct ctc ttg ggt cca tac 7344
Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr
2435 2440 2445

caa gat gtc caa gca atc ctc tcc tac ggt gac aag gct ggt ttg gcg 7392
Gln Asp Val Gln Ala Ile Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala
2450 2455 2460

aac ggt tgc gag gct ctt gct gtc tct cat ggc atg aac gac tct ggt 7440
Asn Gly Cys Glu Ala Leu Ala Val Ser His Gly Met Asn Asp Ser Gly
2465 2470 2475

caa ttc caa ctt gac ttc aac gat ggc aag ttc ctc cca ttc gag ggc 7488
Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly
2480 2485 2490 2495

att gcc att gac caa ggc acc ctc acc ctc tcc ttc cca aac gct tcc 7536
Ile Ala Ile Asp Gln Gly Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser
2500 2505 2510

atg cca gag aag gga aag caa gcc acc atg ctc aag acc ctc aac gat 7584
Met Pro Glu Lys Gly Lys Gln Ala Thr Met Leu Lys Thr Leu Asn Asp
2515 2520 2525

atc atc ctc cac atc agg tac acc atc aag tgagctc 7621
Ile Ile Leu His Ile Arg Tyr Thr Ile Lys
2530 2535